Geographical Data
- Major Roads and Intersections in ASCII format.
There are two files: majorroads.txt describes
every major road and ferry crossing in the United States. This is a 1.3MB text file,
so it may take a while to appear. Each line describes one
road, and there are 47,014 lines.
Here are three sample lines from the file:
CO-17 T-- 19536 19455 0.593
I-85 L-- 16586 16593 0.012
US-56 P-- 16598 16595 0.005
The five items on each line are the road's name (a variable length string), its official
designation (three character string, "T--" means Through Highway, "L--" means Limited
Access Highway, "F-T" means Toll Ferry Crossing, etc.), then two integers indicating
the locations that it links together, then a floating point number giving its length(*).
Most roads are divided into a large number of short segments.
The integers indicating the locations linked by the road are simply line numbers
(starting from zero) in the locations file.
(*) Road lengths are measured in degrees of the Earth's circumference; one degree comes to
almost exactly 69 miles. So a length recorded as 0.100 is really 6.9 miles.
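As an illustration of the line format, here is a minimal Python sketch that splits one road line into its five items and converts the length to miles. The names Road, parse_road and MILES_PER_DEGREE are made up for this example, and it assumes the items are separated by whitespace and that the designation never contains a space.

```python
from typing import NamedTuple

MILES_PER_DEGREE = 69.0  # approximate figure quoted in the text above


class Road(NamedTuple):
    name: str     # variable-length road name, e.g. "CO-17"
    kind: str     # three-character designation, e.g. "T--"
    loc_a: int    # zero-based line number in the locations file
    loc_b: int    # zero-based line number in the locations file
    miles: float  # length converted from degrees to miles


def parse_road(line: str) -> Road:
    # Split from the right so that a name containing spaces (an assumption;
    # the samples have none) would still end up in the first field.
    name, kind, a, b, degrees = line.rsplit(None, 4)
    return Road(name, kind, int(a), int(b), float(degrees) * MILES_PER_DEGREE)


road = parse_road("CO-17 T-- 19536 19455 0.593")
print(road.name, round(road.miles, 1))  # prints: CO-17 40.9
```

The two integers can be used directly as indexes into a list holding the lines of the locations file, since they count from zero.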
The locations.txt file has one line for each location
mentioned in the majorroads file. This is a 1.5MB text file, so it may take a while
to appear. Each line begins with the longitude and latitude
of the location in degrees, followed by some information intended to help in choosing
a name for it. The third item is the number of miles from the nearest place that has
an official name; the rest of the line describes that nearest named place, giving
its population, what kind of place it is ("city", "town", etc), the two
letter abbreviation for its state, and its name. The name may include spaces; it
extends to the end of the line.
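A line of the locations file could be taken apart roughly as follows. This is only a sketch: the function name parse_location and the sample line are invented for illustration (the sample is not taken from the real file), and it assumes the first six items are separated by whitespace and contain no spaces themselves.

```python
def parse_location(line: str) -> dict:
    # Split at most six times so that the place name, which may contain
    # spaces, is kept whole as the seventh and final item.
    lon, lat, dist, pop, kind, state, name = line.split(None, 6)
    return {
        "longitude": float(lon),
        "latitude": float(lat),
        "miles_to_named_place": float(dist),
        "population": int(pop),
        "kind": kind,          # "city", "town", etc.
        "state": state,        # two-letter abbreviation
        "name": name.strip(),  # drop the trailing newline
    }


# Invented sample line, used only to exercise the function.
sample = "-80.1234 25.7890 1.25 5000 town FL Example Springs\n"
print(parse_location(sample)["name"])  # prints: Example Springs
```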
The closests.txt file has one line for each place
described in the alphaplaces file. Here are the first few lines from the file:
7795 7.69723
25892 0.296369
25031 0.0485809
This means that the closest location (in the locations file) to the first entry
in alphaplaces is location 7795, and it is 7.7 miles away.
The closest location (in the locations file) to the second entry
in alphaplaces is location 25892, and it is 0.30 miles away, and so on.
The closests file has the same number of lines as the alphaplaces file.
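Because the two files line up one-to-one, entry number i of the closests file belongs to entry number i of alphaplaces. A small Python sketch of reading the pairs, using the three sample lines above (the function name parse_closests is an assumption for this example):

```python
def parse_closests(lines):
    """Return a list of (location_index, miles) pairs, one per input line."""
    pairs = []
    for line in lines:
        index, miles = line.split()
        pairs.append((int(index), float(miles)))
    return pairs


sample = ["7795 7.69723", "25892 0.296369", "25031 0.0485809"]
closest = parse_closests(sample)

# closest[i] describes the i-th entry of alphaplaces (counting from zero).
print(closest[0])  # prints: (7795, 7.69723)
```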
There is now a second data set available, containing only interstate highways in
the East (longitude > -90) and the locations connected to them. These files are
about one eighth of the size of the original files, so a quadratic algorithm should
be about 64 times faster processing them. The files are
ielocations.txt and
ieroads.txt.
For quick testing, a third data set exists, containing less precise data but for
a wider range of eastern cities and roads. The files for this reduced data set are
fewlocations.txt and
fewroads.txt.
Additionally, there is a fourth data set available, containing only interstate highways in
Florida and the locations connected to them. These files are not very interesting,
but they are very small, so even the most inefficient program should be able to
work with them quickly enough.
The files are
ifllocations.txt and
iflroads.txt.
- alphaplaces.txt
A plain text file containing official data about
every named place in the United States. The file is about 4MB long, and contains 25375 entries.
The file is sorted alphabetically on the place names.
Each entry occupies one line and has a fixed format. It is arranged strictly in columns, which
are not always separated by spaces. The first two letters on a line are the postal abbreviation
for the state. They are followed by a numeric code, then the name of the place. The rest of each
line is filled with numbers. One of the numbers is the population. The last two numbers are
latitude and longitude (negative numbers indicate South of the equator or West of the Greenwich
Meridian).
Similar files can be found here.
- High Resolution state and US boundaries, binary format.
There are 52 files in this directory: one for each
of the states, one for D.C., and one for the continental U.S. They contain very
high resolution descriptions of the outlines of the states and all islands within
state waters; some of the files, especially the one for Alaska, are very big.
The format is quite simple, and is described fully here.
- Digital Elevation Model files in ASCII format.
Each file is a simple text file containing
integers separated by spaces or newlines. The first three numbers in the file specify the
number of rows (R), then the number of columns (C), then the special "marine code" (M).
R and C do not specify how the file is formatted, but how it is to be viewed. After these
three numbers there follow exactly R*C numbers, each giving the elevation above sea level (in metres)
of an individual point. The data is stored in row-major order, so the first C numbers give the
elevations of the Northernmost row of points. Negative values indicate a point that is below
sea level. The special "marine code" M indicates a point that is "out at sea"; it is
not an altitude or the depth of the sea bed, simply a code to represent "not land".
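The description above translates fairly directly into code. The following Python sketch reads the three leading numbers and then regroups the remaining R*C values into rows; the function name parse_dem_text and the tiny sample are invented for illustration. Note that the sample's line breaks deliberately do not match its row length, since R and C say how the data is to be viewed, not how the file is laid out.

```python
def parse_dem_text(text: str):
    """Return (grid, marine_code); grid[0] is the Northernmost row."""
    numbers = [int(token) for token in text.split()]
    rows, cols, marine = numbers[:3]
    values = numbers[3:]
    if len(values) != rows * cols:
        raise ValueError(f"expected {rows * cols} values, found {len(values)}")
    # Row-major order: the first `cols` values form the first row, and so on.
    grid = [values[r * cols:(r + 1) * cols] for r in range(rows)]
    return grid, marine


# Invented 2-row, 3-column sample with marine code -500.
grid, marine = parse_dem_text("2 3 -500 10 20\n-500 -3 0 7")
print(grid)  # prints: [[10, 20, -500], [-3, 0, 7]]

# A point is land when its value differs from the marine code;
# -3 is land below sea level, not sea.
is_land = [[value != marine for value in row] for row in grid]
```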
The files are named to indicate the region that they cover; most, BUT NOT ALL, have 600 rows
and 600 columns. One example should suffice: usaW125N50D10.txt covers a rectangular
region whose top left (N.W.) corner is at longitude 125 West and latitude 50 North; its
longest side is 10 degrees, so if it happens to be square, it will cover the region
between longitudes 125 and 115 West and latitudes 40 and 50 North. The following files
are currently available:
- usaW125N50D60.txt, the whole country
- usaW125N50D20.txt, the Western part of the country
- usaW125N50D10.txt, the whole North-Western region
- usaW125N50D5.txt, mostly Washington state
- usaW95N50D10.txt, the Great Lakes and Northern Midwest
- usaW95N35D10.txt, Louisiana, Mississippi, Alabama
- usaW90N35D10.txt, the South-East
- usaW80N45D10.txt, the North-East
There is also a simple viewer for these files as a Windows executable.
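The naming scheme can be decoded mechanically. Here is a hedged Python sketch; parse_tile_name is a name invented for this example, and the pattern is inferred only from the file names listed above, so it may not fit every file in the directory.

```python
import re


def parse_tile_name(filename: str) -> dict:
    """Decode names such as usaW125N50D10.txt into corner and size."""
    match = re.fullmatch(r"usaW(\d+)N(\d+)D(\d+)\.(?:txt|dat)", filename)
    if match is None:
        raise ValueError(f"unrecognised tile name: {filename}")
    west, north, size = (int(group) for group in match.groups())
    return {
        "left_longitude": -west,  # West longitudes are negative
        "top_latitude": north,
        "longest_side": size,     # in degrees; the tile need not be square
    }


print(parse_tile_name("usaW125N50D10.txt"))
# prints: {'left_longitude': -125, 'top_latitude': 50, 'longest_side': 10}
```

Only the top left corner and the longest side can be read from the name; the other side has to come from the row and column counts inside the file.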
- Digital Elevation Model files in Binary format.
These files contain the same information as the ASCII files, but in a more compact
binary format, so processing them is much faster and more efficient. There are also
many more of them, covering the whole country at each of the 5 degree, 10 degree,
20 degree, 30 degree, 60 degree, and 80 degree tile sizes.
For a tile that is R pixels high and C pixels wide, the file is exactly
2*(R+1)*C bytes long. The factor of two is because the data is in the "short int"
format: signed 16 bit numbers in little-endian order. The factor of R+1 instead of
R is because there is an extra dummy row at the beginning of the file. This dummy
row does not contain elevation data, but "meta-data": information about the
file itself, written as an ASCII string. The R*C two-byte data items that follow
the dummy row are elevations in metres above sea level.
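A sketch of how the layout just described might be unpacked in Python, assuming R and C are already known. The function name split_tile is invented here, and the demonstration tile is synthetic: its dummy row is a placeholder rather than real meta-data.

```python
import struct


def split_tile(data: bytes, rows: int, cols: int):
    """Return (dummy_row_bytes, grid) for a tile of the given shape."""
    expected = 2 * (rows + 1) * cols
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, found {len(data)}")
    row_bytes = 2 * cols
    dummy = data[:row_bytes]  # meta-data row, not elevations
    # "<" means little-endian, "h" means signed 16-bit integer.
    values = struct.unpack(f"<{rows * cols}h", data[row_bytes:])
    grid = [list(values[r * cols:(r + 1) * cols]) for r in range(rows)]
    return dummy, grid


# Synthetic 2-row, 4-column tile built in memory for the demonstration.
fake_dummy = b"meta".ljust(8, b"\0")
fake_body = struct.pack("<8h", 1, 2, 3, -500, 5, 6, 7, 8)
dummy, grid = split_tile(fake_dummy + fake_body, rows=2, cols=4)
print(grid)  # prints: [[1, 2, 3, -500], [5, 6, 7, 8]]
```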
A typical dummy row contains this information:
rows 600 columns 600 bytesperpixel 2 secondsperpixel 60
leftlongseconds -450000 toplatseconds 180000 min 1 max 4048 specialval -500
In each file the strings are exactly the same and in the same order, but the numbers
may be different. The example indicates that the format of the file is 600 rows (R) and
600 columns (C) and 2 bytes per data value (short ints). The resolution is
60 seconds (i.e. one minute or 1/60 degree) per data value, so an entire row of
600 values will cover ten degrees geographically. The top left hand corner is
at a longitude of -450000 seconds (which is 125 degrees West) and a latitude of
+180000 seconds (which is 50 degrees North). The lowest real data value is 1
(metre above sea level) and the highest is 4048. The value -500 is the special
"marine code": when -500 appears in the data it is not an elevation, but an
indication that this point is out at sea, off the coast.
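The dummy row can be turned into a small dictionary, from which the geographic corner follows by dividing seconds by 3600. This Python sketch uses the example row above; the function name parse_dummy_row is invented, and the code assumes that whatever fills the rest of the dummy row after the text is NUL bytes or whitespace, which the description does not actually state.

```python
def parse_dummy_row(raw: bytes) -> dict:
    """Parse alternating 'name number' pairs from the meta-data row."""
    text = raw.decode("ascii", errors="ignore").replace("\0", " ")
    tokens = text.split()
    # Tokens alternate: name, value, name, value, ...
    return {name: int(value) for name, value in zip(tokens[0::2], tokens[1::2])}


example = (
    b"rows 600 columns 600 bytesperpixel 2 secondsperpixel 60 "
    b"leftlongseconds -450000 toplatseconds 180000 "
    b"min 1 max 4048 specialval -500"
).ljust(1200, b"\0")  # padded to 2*C bytes; NUL padding is an assumption

meta = parse_dummy_row(example)
left_longitude = meta["leftlongseconds"] / 3600  # -125.0, i.e. 125 West
top_latitude = meta["toplatseconds"] / 3600      # 50.0, i.e. 50 North
width_degrees = meta["columns"] * meta["secondsperpixel"] / 3600  # 10.0
print(left_longitude, top_latitude, width_degrees)  # prints: -125.0 50.0 10.0
```

Since the number of columns is itself stored in the dummy row, a program that does not know C in advance would need to read the start of the file, parse it as above, and only then work out where the elevation data begins.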
The files may be found here.
These are some good samples for calibration:
Within the same directory, there is also a file called coverage.txt.
This text file has one line for each of the data files; each line gives the exact area covered
by a file. One example line is:
25 20 -100 -95 usaW100N25D5.dat
This indicates that the file usaW100N25D5.dat covers the area between latitudes 20 and 25 degrees North
and longitudes 95 and 100 degrees West.
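A coverage line could be used to decide which file contains a given point. In this Python sketch the field order (Northern latitude, Southern latitude, Western longitude, Eastern longitude, file name) is inferred from the single example line, and the names parse_coverage_line and covers are invented for illustration.

```python
def parse_coverage_line(line: str) -> dict:
    north, south, west, east, filename = line.split()
    return {
        "north": float(north),
        "south": float(south),
        "west": float(west),  # West longitudes are negative
        "east": float(east),
        "filename": filename,
    }


def covers(entry: dict, latitude: float, longitude: float) -> bool:
    """True if the point lies inside the area described by the entry."""
    return (entry["south"] <= latitude <= entry["north"]
            and entry["west"] <= longitude <= entry["east"])


entry = parse_coverage_line("25 20 -100 -95 usaW100N25D5.dat")
print(covers(entry, 22.5, -97.0))  # prints: True
print(covers(entry, 30.0, -97.0))  # prints: False
```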
Here is a Windows executable viewer for the
binary files.