Indexes points and lines and generates map tiles to display them

Datamaps

This is a tool for indexing large lists of geographic points or lines and dynamically generating map tiles from the index for display.

Dependencies

  • Modern C compiler like gcc or clang
  • make
  • libpng
  • Ideally a 64-bit machine with more than 8 GB of free memory

Installation

First install make and libpng, then type:

make

After the build finishes, you will have four new command-line programs in the local directory:

encode render enumerate merge

Usage

The basic idea is that if you have a file of points like this:

40.711017,-74.011017
40.710933,-74.011250
40.710867,-74.011400
40.710783,-74.011483
40.710650,-74.011500
40.710517,-74.011483

or segments like this:

40.694033,-73.987300 40.693883,-73.987083
40.693883,-73.987083 40.693633,-73.987000
40.693633,-73.987000 40.718117,-73.988217
40.718117,-73.988217 40.717967,-73.988250
40.717967,-73.988250 40.717883,-73.988433
40.717883,-73.988433 40.717767,-73.988550

you can index them by doing

cat file | ./encode -o directoryname -z 16

to encode them into a sorted quadtree in web Mercator in a new directory named directoryname, with enough bits to address individual pixels at zoom level 16.
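
As a rough illustration of what a sorted quadtree in web Mercator can look like, the Python sketch below projects a point to integer web Mercator pixel coordinates and interleaves the bits into a single sortable key. It assumes 256-pixel tiles, so zoom 16 needs 16 + 8 = 24 bits per axis. The function names and the exact bit layout are hypothetical and are not taken from encode's actual file format.

```python
import math

TILE_BITS = 8  # assumes 256-pixel tiles (2**8), the render default


def mercator_pixel(lat, lon, zoom):
    """Project lat/lon to integer web Mercator pixel coordinates at `zoom`."""
    n = 1 << (zoom + TILE_BITS)  # pixels across the whole world
    x = (lon + 180.0) / 360.0
    y = (1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0
    # Clamp so that points on the far edge stay in range.
    px = min(max(int(x * n), 0), n - 1)
    py = min(max(int(y * n), 0), n - 1)
    return px, py


def quadkey(px, py, bits):
    """Interleave the bits of px and py, most significant bit first."""
    key = 0
    for i in range(bits - 1, -1, -1):
        key = (key << 2) | (((py >> i) & 1) << 1) | ((px >> i) & 1)
    return key


def encode_points(points, zoom=16):
    """Return sorted quadtree keys for a list of (lat, lon) pairs."""
    bits = zoom + TILE_BITS
    return sorted(quadkey(*mercator_pixel(lat, lon, zoom), bits)
                  for lat, lon in points)
```

Because the key is built from the most significant bits down, all points inside one tile share a key prefix and end up contiguous once sorted, so pulling out a tile becomes a range lookup. This illustrates the general technique only, not the on-disk format.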

You can then do

./render -d directoryname 10 301 385 

to dump back out the points that are part of that tile, or

./render directoryname 10 301 385 > foo.png

to make a PNG-format map tile of the data. (You need more data if you want your tile to have more than just one pixel on it though.)
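
To work out which zoom/x/y tile contains a given point, the usual web Mercator tile arithmetic can be used. The Python sketch below is not part of datamaps and its function name is made up for illustration. It shows that the first sample point above falls in tile 10/301/385.

```python
import math


def latlon_to_tile(lat, lon, zoom):
    """Return the (x, y) web Mercator tile containing lat/lon at `zoom`."""
    n = 1 << zoom  # number of tiles along each axis
    x = (lon + 180.0) / 360.0 * n
    y = (1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n
    # Clamp so that the extreme edges still map to a valid tile.
    return (min(max(int(x), 0), n - 1), min(max(int(y), 0), n - 1))


x, y = latlon_to_tile(40.711017, -74.011017, 10)
print(f"./render directoryname 10 {x} {y} > foo.png")
```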

Alternatively, if you want an image for a particular area of the earth instead of just one tile, you can do

./render -A -- directoryname zoom minlat minlon maxlat maxlon > foo.png

The "--" is because otherwise getopt will complain about negative numbers in the latitudes or longitudes. For example you could use

./render -A -- dots.dm 12 37.192596 -122.811526 38.070528 -121.702961 > sf.png

to generate an image of the San Francisco Bay Area at zoom level 12 from the encoded data in dots.dm.

The point indexing is inspired by Brandon Martin-Anderson's Census Dotmap. The vector indexing is along similar lines but uses a hierarchy of files for vectors that fit in different zoom levels, and I don't know if anybody else does it that way.

Rendering assumes it can mmap an entire copy of the file into the process address space, which isn't going to work for large files on 32-bit machines. Performance, especially at low zoom levels, will be much better if the file actually fits in memory instead of having to be swapped in.

Merging files

encode will only write to a brand new file. If you want to add data to an existing file, the way to do it is to create a new file with encode and then use merge to combine the old and the new.

$ cat newdata | encode -o new.dm
$ merge -o combined.dm old.dm new.dm

merge also has an option, -u, to eliminate duplicates between the source files while merging them.

Generating a tileset

The enumerate and render programs work together to generate a tileset for whatever area there is data for. If you do, for example,

$ enumerate -z14 dirname | xargs -L1 -P8 ./render -o tiles/dirname

enumerate will output a list of all the zoom/x/y combinations that appear in dirname through zoom 14, and xargs will invoke render on each of these to generate the tiles into tiles/dirname.

You can enumerate a single zoom level by specifying both -z (maximum) and -Z (minimum). So if you want just z12, use enumerate -z12 -Z12.

The -P8 makes xargs invoke 8 instances of render at a time. If you have a different number of CPU cores, a different number may work out better.
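
Before generating a large tileset it can help to estimate how many files it might produce. The Python sketch below counts the tiles that cover a bounding box at each zoom level using the standard web Mercator tile arithmetic. It gives only an upper bound, since enumerate lists just the tiles that actually contain data. The function names are illustrative and not part of datamaps.

```python
import math


def latlon_to_tile(lat, lon, zoom):
    """Return the (x, y) web Mercator tile containing lat/lon at `zoom`."""
    n = 1 << zoom
    x = (lon + 180.0) / 360.0 * n
    y = (1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n
    return (min(max(int(x), 0), n - 1), min(max(int(y), 0), n - 1))


def tiles_in_bbox(minlat, minlon, maxlat, maxlon, zoom):
    """Upper bound on the number of tiles covering the box at one zoom."""
    x0, y1 = latlon_to_tile(minlat, minlon, zoom)  # south-west corner
    x1, y0 = latlon_to_tile(maxlat, maxlon, zoom)  # north-east corner
    return (x1 - x0 + 1) * (y1 - y0 + 1)


def tiles_through_zoom(minlat, minlon, maxlat, maxlon, maxzoom, minzoom=0):
    """Total over minzoom..maxzoom, like enumerate -z maxzoom -Z minzoom."""
    return sum(tiles_in_bbox(minlat, minlon, maxlat, maxlon, z)
               for z in range(minzoom, maxzoom + 1))


# The San Francisco Bay Area box used earlier in this README.
bay_area = (37.192596, -122.811526, 38.070528, -121.702961)
for z in (12, 15):
    print(z, tiles_in_bbox(*bay_area, z))
```

Each tile becomes one file, so comparing the total against the free inode count on the target filesystem gives a quick sanity check.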

If you want to filter the output of render, for example through pngquant to reduce the number of colors, you can do it by having xargs invoke a subshell.

$ enumerate -z8 dirname | xargs -L1 -P8 sh -c 'mkdir -p tiles/dirname/$2/$3; render $1 $2 $3 $4 | pngquant 32 > tiles/dirname/$2/$3/$4.png' dummy

The dummy argument is important because sh -c assigns the first argument after the command string to $0, so without it the arguments from enumerate would be shifted by one position.

Adding color to data

The syntax for color is kind of silly, but it works, so I had better document it.

Colors are denoted by distance around the color wheel. The brightness and saturation are part of the density rendering; the color only controls the hue.

If you want to have 256 possible hues, that takes 8 bits to encode, so you need to say

encode -m8

to give space in each record for 8 bits of metadata. Each input record, in addition to the location, also then needs to specify what color it should be, and the format for that looks like

40.711017,-74.011017 :0
40.710933,-74.011250 :85
40.710867,-74.011400 :170

to make the first one red, the second one green, and the third one blue. And then when rendering, you do

render -C256

to say that it should use the metadata as 256ths of the color wheel.
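
The arithmetic behind this is simple enough to sketch in Python. The helper names below are hypothetical and not part of datamaps. The sketch converts a metadata value to a hue angle, assuming the wheel is divided evenly as described above, and formats an input record in the lat,lon :meta form. It also shows one way to turn a fraction (for example, how far into the day a point occurred) into a metadata value.

```python
import colorsys


def hue_degrees(meta, hues=256):
    """Hue angle for a metadata value when the wheel has `hues` steps."""
    return (meta % hues) * 360.0 / hues


def meta_to_rgb(meta, hues=256):
    """Fully saturated RGB for the hue only; render decides the brightness."""
    return colorsys.hsv_to_rgb((meta % hues) / hues, 1.0, 1.0)


def fraction_to_meta(fraction, hues=256):
    """Map a value in [0, 1) to an integer metadata value."""
    return min(int(fraction * hues), hues - 1)


def format_record(lat, lon, meta):
    """One line of encode input with a color attached."""
    return f"{lat:.6f},{lon:.6f} :{meta}"


for meta in (0, 85, 170):
    print(meta, round(hue_degrees(meta), 1), meta_to_rgb(meta))
print(format_record(40.711017, -74.011017, fraction_to_meta(0.5)))
```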


Options to render

Input file, zoom level, and bounds

The basic form is

render dir zoom x y

to render the specified tile into a PNG file on the standard output.

-A ... dir zoom minlat minlon maxlat maxlon
Instead of rendering a single tile (zoom/x/y), the invocation format changes to render the specified bounding box as a single image.
-f dir
Also read input from dir in addition to the file in the main arguments. You can use this several times to specify several input files.

Output file format

-d
Output plain text (same format as encode uses) giving the coordinates and metadata for each point or line within the tile.
-D
Output GeoJSON giving the coordinates and metadata for each point or line within the tile.
-T pixels
Image tiles are the given number of pixels on a side. The default is 256. 512 is useful for high-res "retina" displays.
-r
Leaflet-style retina, where a request for a tile at zoom level N is actually a request for a quarter of a tile at zoom level N-1. In this case, the quarter-tiles remain 256x256.
-o dir
Instead of outputting the PNG image to the standard output, write it in a file in the directory dir in the zoom/x/y hierarchy. It will also write a basic dir/metadata.json that will be used if you package the tiles with mbutil.
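
To make the -r arithmetic concrete, here is a small Python sketch (hypothetical helper names, not render's code) that maps a requested tile at zoom N to the tile at zoom N-1 that contains it and the quadrant it occupies. It assumes the usual web map convention that tile x/y at zoom N-1 splits into tiles 2x..2x+1 and 2y..2y+1 at zoom N. It also builds a zoom/x/y path of the kind -o writes; the .png suffix is an assumption based on the pngquant example earlier in this README.

```python
import os


def retina_source(zoom, x, y):
    """Return the zoom-1 tile and the quadrant of it that tile z/x/y covers."""
    if zoom < 1:
        raise ValueError("zoom 0 has no parent tile")
    parent = (zoom - 1, x // 2, y // 2)
    quadrant = (x % 2, y % 2)  # (0, 0) is the top-left quarter
    return parent, quadrant


def tile_path(directory, zoom, x, y):
    """Path in a zoom/x/y hierarchy; the .png suffix is an assumption."""
    return os.path.join(directory, str(zoom), str(x), f"{y}.png")


print(retina_source(10, 301, 385))
print(tile_path("tiles/dirname", 10, 301, 385))
```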

Background

-t opacity
Changes the background opacity. The default is 255, fully opaque.
-w
The default background color becomes white, not black.
-b hex
Specifies hex to be the background color. The default is black (or white, if -w is set.)
-m
Makes the output image a mask: The data areas are transparent and the background is opaque. The default is the opposite.

Color

-c hex
Specifies hex to be the fully saturated color at the middle of the output range. The default is gray.
-S hex
Specifies hex to be the oversaturated color at the end of the output range. The default is white.
-s
Use only the color range leading up to full saturation. The default treats saturated color as the middle of the range and allows the output to be oversaturated all the way to white (or the -S color).

Brightness and thickness

-B base:brightness:ramp
Sets the basic display parameters:
  • Base is the zoom level where each point is a single pixel. The default is 13.
  • Brightness is the value contributed by each dot at that zoom level. The default is 0.05917. With the default (square root) gamma, this means it takes 4 dots on the same pixel to reach full color saturation and 16 to reach full oversaturation. (It should have been 0.0625 so that it would hit it exactly.)
  • Ramp is an additional brightness boost given to each dot as zoom levels get higher, or taken away as zoom levels get lower, slightly increasing the effect of halving the number of dots with each zoom level. The default is 1.23.
-e exponent
Allows specifying a different rate at which dots are dropped at lower zoom levels. The default is 2, and anything much higher than that will look terrible at low zoom levels, and anything much lower will be very slow at low zoom levels. 1.5 seems to work pretty well for giving a quality boost to the low zoom levels. The ramp from -B is automatically adjusted to compensate for the change.
-G gamma
Sets the gamma curve, which causes each additional dot plotted on the same pixel to have diminishing returns on the total brightness. The default is 0.5, for square root.
-L thickness
Sets the base thickness of lines. The default is 1, for a single pixel thickness.
-l ramp
Sets the thickness ramp for lines. The line gets thicker by a factor of ramp for each zoom level beyond the base level from -B. The default is 1, for constant thickness. Thicker lines are drawn dimmer so that the overall brightness remains the same.
-p area
Specifies a multiplier for dot sizes. Point brightness is automatically reduced by the same factor so the total brightness remains constant, just diffused. The default is 1. (Example: -p5 for area 5)
-p garea
Specifies a Gaussian brush instead of a flat disk, as well as a multiplier for dot sizes. The argument is a literal g followed by the area. (Example: -pg5 for Gaussian with area 5)
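
The numbers quoted for -B and -G can be checked with a little arithmetic. The Python sketch below assumes a simplified model in which a pixel's value is (dots × brightness) raised to the gamma, with 0.5 meaning full color saturation and 1.0 meaning full oversaturation. That model is inferred from the descriptions above rather than taken from render's source, and the function name is made up.

```python
def pixel_value(dots, brightness=0.05917, gamma=0.5):
    """Simplified model: value of a pixel hit by `dots` overlapping dots."""
    return min((dots * brightness) ** gamma, 1.0)


for brightness in (0.05917, 0.0625):
    values = [round(pixel_value(n, brightness), 4) for n in (1, 4, 16)]
    print(brightness, values)
```

Under this model, a brightness of 0.0625 gives exactly 0.5 for 4 dots and exactly 1.0 for 16 dots, while the default 0.05917 comes out slightly lower in both cases, which matches the near miss noted in the -B description.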

Metadata

-C hues
Interpret the metadata as one of the given number of hues around the color wheel. Numbering starts at 0 for red and continues through orange, yellow, green, blue, violet, and back to red.
-C meta1:hue1:meta2:hue2
Specify a range of hues that correspond to a domain of meta values. The hues are numbered in degrees: 0 for red, 30 for orange, 60 for yellow, 120 for green, 180 for cyan, 240 for blue, 300 for violet. You can specify hues below 0 or above 360 to wrap around across red.
-x cradiusf / -x cradiusm
Interpret the metadata as a number of points to be plotted in the specified radius around the point in the data. The argument is a literal c, then the radius, then f for feet or m for meters.
-x b
Make the brightness of each feature proportional to the metadata value.
-x r
Make the radius of each point proportional to the metadata value.
-x smax
Cap the saturation of meta colors at max (given as a literal s followed by the value) instead of 0.7. They will go all the way to white if you use 1.
-x u
Use an approximation of CIELCH uniform color space so that all hues with the same density will have approximately equal lightness and saturation. Blues will be brighter and greens will be dimmer.
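
The range form of -C can be pictured as a linear mapping from metadata values to hue angles. The Python sketch below assumes plain linear interpolation between the two endpoints, with the result wrapped into 0 to 360 degrees. Whether render interpolates exactly this way is an assumption, and the function name is hypothetical.

```python
def hue_for_meta(meta, meta1, hue1, meta2, hue2):
    """Linearly map meta in [meta1, meta2] to a hue angle in degrees."""
    if meta1 == meta2:
        raise ValueError("meta1 and meta2 must differ")
    t = (meta - meta1) / (meta2 - meta1)
    return (hue1 + t * (hue2 - hue1)) % 360.0


# A hypothetical setting of 0:240:100:360 would run from blue (240)
# through violet (300) and wrap around to red (360, i.e. 0).
for meta in (0, 50, 100):
    print(meta, hue_for_meta(meta, 0, 240, 100, 360))
```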

Compensation

-g
Reduce the brightness of lines whose endpoints are far apart, to compensate for GPS samples that jump around, or bogus connections to 0,0.
-O base:dist:ramp
Tune the parameters for reasonable distances between points:
  • Base is the zoom level at which only fully acceptable samples are given full brightness. The default is 16.
  • Dist is the allowable distance between samples at the base zoom level. The unit is z32 tiles, or about 1cm, and I need to make that something more human-oriented. The default is 1600.
  • Ramp is the factor of additional distance that is allowed at each lower zoom level. The default is 1.5.
-M latitude
Mercator compensation. Makes the dots bigger at latitudes higher than the one specified and smaller at latitudes closer to the equator.
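
To get a feel for the -O units, the Python sketch below converts z32 tile units to meters at the equator and applies the ramp at lower zoom levels. It assumes the allowable distance is dist × ramp^(base − zoom), which is one reading of the description above rather than render's confirmed formula. The Earth circumference constant and the function names are supplied here for illustration.

```python
EARTH_CIRCUMFERENCE_M = 40075016.686  # equatorial, web Mercator sphere
Z32_UNIT_M = EARTH_CIRCUMFERENCE_M / 2 ** 32  # a bit under 1 cm


def allowed_distance(zoom, base=16, dist=1600, ramp=1.5):
    """Allowable distance between samples, in z32 units, at `zoom`.

    Zoom levels above base are left at dist, which is a guess.
    """
    return dist * ramp ** max(base - zoom, 0)


def z32_to_meters(units):
    """Convert z32 units to meters as measured at the equator."""
    return units * Z32_UNIT_M


for zoom in (16, 15, 12):
    units = allowed_distance(zoom)
    print(zoom, units, round(z32_to_meters(units), 1))
```

With the defaults, the 1600 units allowed at zoom 16 work out to roughly 15 meters at the equator.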

Vector styling

-v
Instead of the normal output, produce a CartoCSS file for TileMill 2 to approximate the brightness, dot ramp, gamma, and colors you specified. Use render-vector to make the vector tiles themselves.

Useless

-a
Turn off anti-aliasing
Comments
  • Add readme instructions for stitching tilesets

    After a tileset has been created with enumerate and render could you add instructions for some next steps? e.g. how could one stitch/combine all the individual tiles to form one large image?

    opened by umaar 6
  • Tile count estimate

    I was generating a tileset for a single zoom level (15) and actually ran out of inodes!

    Command: enumerate -z15 -Z15 dirname | xargs -L1 -P8 ./render -o tiles/dirname

    To ensure the machine I'm using is suitable, how can I figure out the total file count (not size) for zoom level 15?

    opened by umaar 3
  • Option to splat points with transparency

    I've been trying to approximately reproduce a vector-based map I created. In the original, I splat dots at 40% opacity, and with color based on a timestamp:

    [image: screenshot 2014-06-30 10 35 08]

    Alas, we're working with millions of points now and the svg approach has become unscalable, so I am looking toward raster solutions and found this repo (which is amazing, by the way). It'd be great to have the ability to splat points with a specified alpha in datamaps -- but looking through the code I'm not really sure where to start. I've tried some post-processing approaches, but can't find a good mapping from color to alpha. For the best so far I've had to resort to single color and mapping greyscale directly to alpha, which leaves large square grey regions around each point:

    [image: screenshot 2014-06-30 17 15 00]

    Thoughts?

    opened by Meekohi 3
  • merging input data to index

    It's not clear in the documentation if the encode tool will merge data into the index if a call is pointing to an existing index. Is that the case?

    I tried to read the source code, but I guess my C skills are a little bit rusty.

    Thanks for the great project.

    opened by marcioaguiar 3
  • Enumerate a single zoom level?

    Any way of enumerating over just a single zoom level? I'm just doing slight brightness and base changes on each zoom level manually to get more control than ramp allows.

    Right now I'm just enumerating through everything up to a zoom level and then throwing away everything I don't need, which is wasteful and slow.

    opened by aaronlidman 3
  • [Question] Readme lacking information?

    EDIT: Please delete the question. I just had to do "make" to compile everything using the makefile.

    ORIGINAL:

    I have almost no knowledge of C at all, and I am trying to follow the Readme in order to make a map as cool as yours.

    So right now, I've followed these steps:

    1. clone this repo: git clone https://github.com/ericfischer/datamaps.git
    2. cd datamaps
    3. mkdir output_folder
    4. copy in the /datamaps folder a txt file following this structure:

      lat,lon lat,lon lat,lon ...

    5. cat file.txt | ./encode -o output_folder -z 16

    however, i get the following error.

    -bash: ./encode: No such file or directory
    

    I guessed it is because encode.c is not compiled.

    Then I tried compiling it doing: gcc encode.c -o encode

    and that throws another error:

    /tmp/ccTwkJ7r.o: In function `read_file':
    encode.c:(.text+0x589): undefined reference to `latlon2tile'
    encode.c:(.text+0x750): undefined reference to `bytesfor'
    encode.c:(.text+0x822): undefined reference to `xy2buf'
    encode.c:(.text+0x87a): undefined reference to `xy2buf'
    encode.c:(.text+0x8df): undefined reference to `meta2buf'
    /tmp/ccTwkJ7r.o: In function `main':
    encode.c:(.text+0x12c1): undefined reference to `bytesfor'
    encode.c:(.text+0x12d3): undefined reference to `gSortBytes'
    encode.c:(.text+0x163d): undefined reference to `bufcmp'
    collect2: error: ld returned 1 exit status
    

    Any help on what I am doing wrong and how to make the maps would help me a lot!

    opened by manugarri 3
  • Error on Mavericks

    When compiling render:

    undef: ___sincos_stret Undefined symbols for architecture x86_64:

    The solution was to run xcode-select --install which I found here: https://github.com/mxcl/homebrew/issues/22553

    I'm only using the xcode CLI tools, maybe that's it?

    opened by aaronlidman 3
  • Time based colouring

    Hey @ericfischer - assume that for each data point, I have a percentage for how far into the day it occurred. What could I feed into the encode script to have it consider this for colouring? I'd like to end up with something where I can see that data points within a certain geographic area happened early in the day, whereas data points within another area happened later into the day. Ideas are appreciated, thank you!

    opened by umaar 2
  • Area documentation

    The docs state

    -p area Specifies a multiplier for dot sizes. Point brightness is automatically reduced by the same factor so the total brightness remains constant, just diffused. The default is 1. -p garea Specifies a Gaussian brush instead of a flat disk, as well as a multiplier for dot sizes.

    Because both are -p, it's not particularly clear how to specify a Gaussian brush.

    opened by pnorman 2
  • License and version

    Hi Eric We've written a specfile for datamaps to aid in our deployment on Centos and Fedora based machines and it'd be handy if we could assign values for license and version number to it. We'd be happy to submit the specfile for inclusion via a PR too.

    Thank you for your time.

    opened by deanwilson 2
  • better support for building against libpng 1.6.x  with pkg-config

    I have libpng 1.5 installed in /usr/local and 1.6 (required by this code) in a custom /opt location. This fix makes it easier to build against 1.6 without tweaking the makefile by just passing PKG_CONFIG_PATH=/opt/libpng-1.6/lib/pkgconfig (on my system).

    opened by springmeyer 2
  • Specifying color for data points in large file

    @ericfischer Is there an option to specify a color code for a huge dataset using encode? In my case, there is a .osm file for buildings in SF Peninsula and I want the output gif from osm-animate to be shown in some bright color.

    opened by ramyaragupathy 2
  • need render-vector documentation

    I see that render-vector is a task in the Makefile, but when I add it to the all task, I get an error. Probably because I don't have all the TM2 dependencies installed - but not sure which ones.

    On a related note - a million thanks for open-sourcing your secret sauce. This is a really cool piece of code.

    opened by gabrielflorit 2
  • Addition of a build system generator

    I suggest to reuse a higher level build system than your current small make file so that powerful checks for software features will become easier.

    opened by elfring 7
  • Completion of error handling

    I have looked at a few source files for your current software. I have noticed that some checks for return codes are missing.

    Would you like to add more error handling for return values from functions like the following?

    opened by elfring 3
Owner
Eric Fischer