31 August 2012

Guide to creating maps with Stata

Most charts and maps on this site were created with the Stata statistical software package. This guide explains how maps like those with adult and youth literacy rates in 2010 can be created with Stata. The article supersedes an earlier version from 2005 and introduces updated maps with current country borders. For example, South Sudan, which seceded from Sudan in 2011, is shown as a separate country on the new maps. The instructions below are for Stata version 9.2 or later. Users of Stata 8 are referred to the guide from 2005. Versions of Stata older than 8 do not support the creation of maps.

Requirements

  • Stata version 9.2 or later.
  • spmap: Stata module for drawing thematic maps, by Maurizio Pisati. spmap can be installed in Stata with this command:
      ssc install spmap
  • shp2dta: Stata module for converting shapefiles to Stata format, by Kevin Crow. shp2dta can be installed in Stata with this command:
      ssc install shp2dta
  • Shapefile: A shapefile is a data format for geographic information systems. For the maps in Figures 1 and 2, please download this public domain shapefile from Natural Earth:
    ne_110m_admin_0_countries.zip (184 KB, world map with country borders, scale 1:110,000,000)
  • Note: The instructions are accurate for Natural Earth maps version 2.0.0, the most recent version of the maps available in April 2014.

Step 1: Convert shapefile to Stata format

  • Unzip ne_110m_admin_0_countries.zip to a folder that is visible to Stata, for example the current working directory of Stata. The archive contains six files:
    ne_110m_admin_0_countries.dbf
    ne_110m_admin_0_countries.prj
    ne_110m_admin_0_countries.shp
    ne_110m_admin_0_countries.shx
    ne_110m_admin_0_countries.README.html
    ne_110m_admin_0_countries.VERSION.txt
  • Start Stata and run this command:
      shp2dta using ne_110m_admin_0_countries, data(worlddata) coor(worldcoor) genid(id)
  • Two new files will be created: worlddata.dta (with the country names and other information) and worldcoor.dta (with the coordinates of the country boundaries).
  • If you plan to superimpose labels on a map, for example country names, run the following command instead, which adds centroid coordinates to the file worlddata.dta:
      shp2dta using ne_110m_admin_0_countries, data(worlddata) coor(worldcoor) genid(id) genc(c)
  • Please refer to the spmap documentation to learn more about labels.
  • The DBF, PRJ, SHP, and SHX files are no longer needed and can be deleted.
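
The conversion in Step 1 can also be collected in a short do-file. The following is a minimal sketch, assuming the shapefile components were unzipped into Stata's current working directory. The replace option of shp2dta overwrites worlddata.dta and worldcoor.dta if they already exist.

```stata
* Sketch of Step 1 as a do-file; assumes the unzipped shapefile is in the
* current working directory (displayed by pwd).
pwd
confirm file ne_110m_admin_0_countries.shp   // stops with an error if the file is not found

* Convert the shapefile; genc(c) adds centroid coordinates and
* replace overwrites existing output files.
shp2dta using ne_110m_admin_0_countries, data(worlddata) coor(worldcoor) genid(id) genc(c) replace

* Inspect the converted attribute file
use worlddata, clear
describe
```

If confirm file reports an error, either move the unzipped files to the folder shown by pwd or change the working directory with cd before running shp2dta.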

Step 2: Draw map with Stata

  • Open worlddata.dta in Stata.
  • For the example maps, create a variable with the length of each country's name. The Stata command for this is:
      generate length = length(admin)
  • Draw a map that indicates the length of all country names with this command:
      spmap length using worldcoor.dta, id(id)
  • The default map (Figure 1) is grayscale and shows Antarctica. It has four classes for the length of the country names, a very small legend, and legend values arranged from high to low.
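
Before drawing a map it can help to inspect the variable that determines the shading. The following is a minimal sketch, meant to be run as a do-file and assuming that worlddata.dta from Step 1 is in the current working directory. The local macro names shortest and longest are chosen for illustration only.

```stata
use worlddata, clear
generate length = length(admin)      // number of characters in each country name

summarize length                     // range and mean of the name lengths
local shortest = r(min)
local longest  = r(max)

* List the countries with the shortest and the longest names
list admin length if length == `shortest' | length == `longest', noobs
```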

Figure 1: Length of country names (small scale map, default style)


  • A second map without Antarctica, with a blue palette, five classes, and with a bigger legend with values arranged from low to high (Figure 2) can be drawn with this command:
      spmap length using worldcoor.dta if admin!="Antarctica", id(id) fcolor(Blues) clnumber(5) legend(symy(*2) symx(*2) size(*2)) legorder(lohi)
  • Darker colors on the map indicate longer country names, ranging from 4 (for example Cuba and Fiji) to 35 characters (French Southern and Antarctic Lands).
  • Please read the Stata help file for spmap to learn about the many additional options for customization of maps.
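
In a do-file, the long spmap command for Figure 2 is easier to read when it is split across lines with ///. The sketch below is the same command with one option per line, followed by graph export to save the map. The file name figure2.png is a placeholder, and PNG export is assumed to be available on your platform.

```stata
use worlddata, clear
generate length = length(admin)

spmap length using worldcoor.dta if admin!="Antarctica", ///
    id(id)                                   /// links worlddata.dta to worldcoor.dta
    fcolor(Blues)                            /// blue palette
    clnumber(5)                              /// five classes
    legend(symy(*2) symx(*2) size(*2))       /// bigger legend
    legorder(lohi)                           //  legend values from low to high

* Save the map that was just drawn (placeholder file name)
graph export figure2.png, replace
```

The /// continuation only works in do-files. In the Command window, the command has to be entered on a single line as shown in the list above.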

Figure 2: Length of country names (small scale map, blue palette)


Alternative maps with more detail

The shapefile that was used for Figures 1 and 2 was designed for small maps. It contains the borders for 177 countries and territories and does not include smaller geographic units like Hong Kong, Monaco, or St. Vincent and the Grenadines. As an alternative to the small scale map in Figures 1 and 2, Natural Earth offers shapefiles with more detail that were designed for larger maps.

  • To create the map in Figure 3, download this shapefile from Natural Earth, which has information for 241 countries and territories:
    ne_50m_admin_0_countries.zip (799 KB, world map with country borders, scale 1:50,000,000)
  • Unzip ne_50m_admin_0_countries.zip to a folder that is visible to Stata.
  • Run this Stata command to convert the shapefile to Stata format:
      shp2dta using ne_50m_admin_0_countries, data(worlddata2) coor(worldcoor2) genid(id)
  • If you need Stata files with centroids, run this command instead:
      shp2dta using ne_50m_admin_0_countries, data(worlddata2) coor(worldcoor2) genid(id) genc(c)
  • Open worlddata2.dta in Stata.
  • Create a variable with the length of each country's name:
      generate length = length(admin)
  • Draw the map in Figure 3:
      spmap length using worldcoor2.dta if admin!="Antarctica", id(id) fcolor(Blues) clnumber(5) legend(symy(*2) symx(*2) size(*2)) legorder(lohi)
  • The map takes longer to draw than the map in Figures 1 and 2 because it is more detailed and shows more geographic units. The names of the countries and territories on the map have a length up to 40 characters (South Georgia and South Sandwich Islands).
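
The number of countries and territories in each converted file can be checked directly, because each observation corresponds to one record of the shapefile. A minimal sketch, assuming that worlddata.dta and worlddata2.dta were created as described in this guide (the article reports 177 and 241 records, respectively):

```stata
* Count the records in the small scale and medium scale files
foreach f in worlddata worlddata2 {
    use `f', clear
    display "`f'.dta: " _N " countries and territories"
}
```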

Figure 3: Length of country names (medium scale map)


  • To create the map in Figure 4, download this shapefile from Natural Earth, which has information for 255 countries and territories, including small islands like the Ashmore and Cartier Islands:
    ne_10m_admin_0_countries.zip (5.1 MB, world map with country borders, scale 1:10,000,000)
  • Unzip ne_10m_admin_0_countries.zip to a folder that is visible to Stata.
  • Run this Stata command to convert the shapefile to Stata format:
      shp2dta using ne_10m_admin_0_countries, data(worlddata3) coor(worldcoor3) genid(id)
  • If you need Stata files with centroids, run this command instead:
      shp2dta using ne_10m_admin_0_countries, data(worlddata3) coor(worldcoor3) genid(id) genc(c)
  • Open worlddata3.dta in Stata.
  • Create a variable with the length of each country's name:
      generate length = length(ADMIN)
  • Draw the map in Figure 4:
      spmap length using worldcoor3.dta if ADMIN!="Antarctica", id(id) fcolor(Blues) clnumber(5) legend(symy(*2) symx(*2) size(*2)) legorder(lohi)
  • The map takes longer to draw than the maps in Figures 1, 2 and 3 because it has the largest amount of detail. The differences between the maps in Figures 3 and 4 can be seen by clicking on the images to enlarge them. Figure 4 has more islands and more detailed shorelines. The names of the countries and territories on the map in Figure 4 have a length up to 40 characters (South Georgia and South Sandwich Islands).
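
Note that the commands for the 1:10m file refer to the variable ADMIN in upper case, while the commands for the other two files use admin. Stata variable names are case-sensitive, so a do-file that should work with any of the three files could normalize the name first. The following is a minimal sketch, assuming that the country name is stored in either ADMIN or admin and that the exact option of confirm is available in your version of Stata.

```stata
use worlddata3, clear

* Rename ADMIN to admin if the upper-case name is present
capture confirm variable ADMIN, exact
if _rc == 0 {
    rename ADMIN admin
}

generate length = length(admin)
spmap length using worldcoor3.dta if admin!="Antarctica", id(id) fcolor(Blues) clnumber(5)
```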

Figure 4: Length of country names (large scale map)


Software used in this guide

  • Stata: statistical software package
  • spmap: Stata module for drawing thematic maps, by Maurizio Pisati
  • shp2dta: Stata module for converting shapefiles to Stata format, by Kevin Crow
  • ne_110m_admin_0_countries.zip: small scale (1:110,000,000) Natural Earth world map with country borders (184 KB)
  • ne_50m_admin_0_countries.zip: medium scale (1:50,000,000) Natural Earth world map with country borders (799 KB)
  • ne_10m_admin_0_countries.zip: large scale (1:10,000,000) Natural Earth world map with country borders (5.1 MB)


Friedrich Huebler, 31 August 2012 (edited 18 April 2015), Creative Commons License
Permanent URL: http://huebler.blogspot.com/2012/08/stata-maps.html

32 comments:

  1. brilliant tutorial on something visually lovely. would love to see more of these!

  2. Is there an easy way to display just Europe, as opposed to specifying each of the countries you don't want displayed?

    thanks.

  3. You can use a shapefile that only has European countries. A Google search turns up several free shapefiles for Europe.

  4. Cheers for the tutorial.

    I have somewhat the same problem.
    When using the world map "10m-admin-0-countries" and entering a couple of countries, spmap crops itself to the minimal size (as it did when you excluded "Antarctica"). However, countries like "France" make the map grow to global size because all the old colonies are included. Is there a way to tell Stata/spmap to focus on or crop to a specific area?

  5. It is possible to focus on a specific area with the plotregion(margin()) option. This can be demonstrated with a map from Natural Earth.

    First, download the countries dataset from the 1:110m Cultural Vectors page (direct link, 184 KB).

    Unzip the downloaded file and convert the map to Stata format with this command: shp2dta using ne_110m_admin_0_countries, data(worlddata) coor(worldcoor) genid(id)

    Open the file worlddata.dta in Stata and generate a variable with the length of country names with this command: generate length = length(admin)

    The following command creates a map of the entire world: spmap length using worldcoor.dta, id(id)

    This command creates a map of France, Portugal and Spain that also shows the French area in South America: spmap length using worldcoor.dta if admin=="France" | admin=="Portugal" | admin=="Spain", id(id)

    We can now limit this map to Europe by adding the plotregion(margin()) option to the command above: spmap length using worldcoor.dta if admin=="France" | admin=="Portugal" | admin=="Spain", id(id) plotregion(margin(-240 0 -170 0))

    The appropriate margins can be found through trial and error. The command above adds a negative margin on the left side and on the bottom of the map, which means that the map is cropped on those sides. For additional information see the Stata documentation (help region_options).

  6. Hi Huebler,

    It seems the link to download the shapefile from Natural Earth should be http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/110m/cultural/ne_110m_admin_0_countries.zip

    Regards,
    Yangki

  7. Yangki, thank you for informing me that the links had changed. I updated the article and all links, where necessary. The instructions now apply to the Natural Earth maps version 2.0.0, the most recent version available in April 2014.

  8. Can anyone recommend any shapefiles that only have European countries? I agree that a Google search identifies several, but they are not all as easy to use as the Natural Earth file in this example. Links would be greatly appreciated.

  9. Greg, you can use shapefiles from Natural Earth and draw a map that is limited to Europe. See my comment of 28 August 2013 for an example using Stata.

  10. Thanks Friedrich this worked well.

    Is there any way to change the colour of the ocean?

    Greg

  11. Greg, you can change the color of the ocean with the plotregion option. For example, to change the color of the ocean in Figure 1 from the default white to blue, use this command:

    spmap length using worldcoor.dta, id(id) plotregion(fcolor(blue))

  12. Is it possible to get rid of the small islands without deleting countries like Malta?

    I would like to create a map of the EU and the small islands are a little irritating.

    Thanks in advance!

  13. Sebastian, you may be able to suppress small islands with the if condition in Stata. See the example for Figure 2, which uses if to hide Antarctica from the map. As an alternative you could use the 1:110m map from Natural Earth (shown in Figures 1 and 2), but that wouldn't include Malta.

  14. Thank you for the tutorial.

    I would like to generate a map where countries that meet some condition, e.g. having a particular law, are shaded, while countries that do not, remain colorless.

    I have a binary variable (0's and 1's) in my dataset, but when I use it as the "attribute," Stata thinks the 0's are another "class," and shades them light gray, while shading the 1's as dark gray. How can I leave the 0's unshaded while shading the 1's?

    Thank you!

  15. I am afraid this question goes beyond my knowledge of spmap. I would recommend that you ask your question on Statalist.
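
    One approach that may work, based on the spmap help file rather than on the examples in the article, is to treat each value as its own class with clmethod(unique) and to list one fill color per value. In the sketch below, longname is a hypothetical 0/1 variable created only for illustration, and the colors are arbitrary.

    ```stata
    use worlddata, clear
    generate byte longname = length(admin) > 10    // hypothetical 0/1 attribute

    * Two unique values, two colors: the first color is meant for 0, the second for 1
    spmap longname using worldcoor.dta if admin!="Antarctica", id(id) clmethod(unique) fcolor(white navy)
    ```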

  16. Hi Friedrich,

    Thanks for the tutorial.

    I am trying to create a map of the world with different colours for each country.

    I typed in this code:
    spmap length using worldcoor2.dta if admin=="Malaysia", id(id) fcolor(Blues) clnumber(5) legend(symy(*2) symx(*2) size(*2)) legorder(lohi)

    And got an error code:
    nquantiles() must be less than or equal to number of observations plus one

    Could you please assist? Thanks!

  17. Nina, you wrote that you are trying to create a map of the world but your command draws only a map with one country, Malaysia. If you select more countries you shouldn't get the error message you reported. Example:

    spmap length using worldcoor2.dta if admin=="Malaysia" | admin=="Indonesia" | admin=="Brunei" | admin=="Timor-Leste" | admin=="Papua New Guinea", id(id) fcolor(Blues) clnumber(5) legend(symy(*2) symx(*2) size(*2)) legorder(lohi)
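
    The error arises because clnumber(5) asks for five classes while only one country is selected. When several countries are selected, the chain of | conditions can be shortened with Stata's inlist() function. A sketch of the same command in that form, written for a do-file and assuming worlddata2.dta and worldcoor2.dta exist as described in the article:

    ```stata
    use worlddata2, clear
    generate length = length(admin)

    * Same selection of five countries, written with inlist()
    spmap length using worldcoor2.dta ///
        if inlist(admin, "Malaysia", "Indonesia", "Brunei", "Timor-Leste", "Papua New Guinea"), ///
        id(id) fcolor(Blues) clnumber(5) legend(symy(*2) symx(*2) size(*2)) legorder(lohi)
    ```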

  18. Thanks a lot!

    Would it be possible to use a different colour for each country?

  19. Nina, you can assign a different color to each country. The attribute (the variable "length" in the example in my previous comment) would have to contain unique values for each country and in the fcolor() option you would have to specify about 200 different colors, one for each country on the map. See help spmap for additional information.

  20. Thank you! I appreciate your help.

  21. Hi Huebler,

    Thank you for the tutorial, helps a lot.

    I want to change my legend labels so that they appear shorter than the actual data, for example by showing only 2 decimal places.

    What command option would work? Thanks!

  22. Please see help legend_options in Stata, especially the section "Content suboptions for use with legend() and plegend()".

  23. Hi,
    Excuse me for my ignorance in this matter, I just recently started to use Stata. Can you give me an example of a visible folder?
    When I use the following command:
    "shp2dta using ne_110m_admin_0_countries, data(worlddata) coor(worldcoor) genid(id)"
    error r(601) appears...

    Kind regards,
    John

  24. Dear Huebler,

    I am struggling to get off the touchline.

    I am intending to publish a list of point coordinates in the ocean (which I have as latitude and longitude columns in my dataset).

    However, when trying to follow the example's first line,
    'shp2dta using... etc. (...) gen(id)' I get the error
    file already exists.

    What am I doing wrong? I have one single dataset where all the latitude and longitude data is present.

  25. Dear Friedrich,
    First of all let me thank you for sharing your maps and teaching the community how to do them.
    Basically, I want to draw a map fusing two sets of colors as the index I want to represent goes from -1 to 1. So I want a color for the negative countries, another color for the positive ones. Also, I was wondering if you know how to fix the range each shade of color has. For example, I want one degree of blue going from 0 to 0.25, another a bit stronger, from 0.26 to 0.50 and so on.
    Any help will be strongly appreciated,
    Best regards,
    David

  26. I modified the article to make clearer what is meant by a "folder that is visible to Stata". An example for such a folder is the current working directory of Stata, shown in the lower left corner of the Stata window. The Stata command pwd also displays the name of the current working directory.

  27. The command shp2dta using ne_110m_admin_0_countries, data(worlddata) coor(worldcoor) genid(id) leads to a "file already exists" error message if the files worlddata.dta and/or worldcoor.dta already exist in the same folder. To avoid the error, there are three options: (a) delete the files worlddata.dta and worldcoor.dta before running the command, (b) use different names in the data() and coor() options, or (c) use the replace option of the shp2dta command (see help shp2dta).

  28. Regarding the colors, in the examples in my article the color is determined by the attribute variable "length"; the variable "id" only links each country to its coordinates. You can create an attribute variable of your own and assign values to it based on another variable, for example generate class = 5 if indicator>=0 & indicator<0.26 (where "indicator" is the variable the colors in the map refer to and "class" is the new attribute variable). Then, you use the fcolor option of spmap to assign colors in the order of the values in the attribute variable. For example, if "class" has 8 different values (from 1 to 8) and 5 represents 0 to 0.25, then the fcolor option would list 8 different colors.

    I cannot go into more detail here but this is all documented in help spmap. Quote from the spmap help file: "fcolor(colorlist) specifies the list of fill colors of the base map polygons. When no choropleth map is drawn, the list should include only one element. On the other hand, when a choropleth map is drawn, the list should be either composed of k elements, or represented by the name of a predefined color scheme."

    In the examples in the article I use fcolor with the Blues color scheme but I could also have specified five different colors.
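
    Fixed class ranges can also be requested with the clmethod(custom) and clbreaks() options of spmap. The sketch below is an illustration only: indicator is a hypothetical variable constructed to lie between -1 and 1, and the colors are arbitrary Stata color names, two for the negative classes and four for the positive ones.

    ```stata
    use worlddata, clear
    generate indicator = (length(admin) - 20) / 20    // hypothetical values between -1 and 1

    * Seven breaks define six classes, so fcolor() lists six colors
    spmap indicator using worldcoor.dta if admin!="Antarctica", id(id) ///
        clmethod(custom) clbreaks(-1 -0.5 0 0.25 0.5 0.75 1) ///
        fcolor(red orange ltblue midblue blue navy)
    ```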

  29. Hi, please let me know how I can show country names on the map. My other question is how I can draw a regional map, for instance of South Asia. Thank you in advance.

  30. Hi, please let me know how I can show each country's name in the place of the country and how I can draw a regional map, for instance of South Asia.

    Thank you in advance.

  31. Labels are described in detail in the spmap help file. Enter the command help spmap in Stata and then go to the section describing the "Option label() suboptions". The help file also contains examples that show how labels can be added to a map.

    You have two options if you want to draw a regional map, for example for South Asia. You can either use a shapefile for the region, or you can limit the map to specific countries with the help of if. The second approach is used in the examples in the article to exclude Antarctica from the map (compare Figure 1 and Figure 2).
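
    As a starting point, the following sketch combines both answers and labels the countries of a regional map. It assumes that worlddata.dta was created with the genc(c) option, so that it contains the centroid variables x_c and y_c, and that the listed names match the values of the admin variable in the 1:110m file. The file name southasia.dta is a placeholder.

    ```stata
    use worlddata, clear
    generate length = length(admin)

    * Keep the countries of the region and save them as the label dataset
    keep if inlist(admin, "Afghanistan", "Bangladesh", "Bhutan", "India", "Nepal", "Pakistan", "Sri Lanka")
    save southasia.dta, replace

    * Draw the regional map with country names placed at the centroids
    spmap length using worldcoor.dta, id(id) fcolor(Blues) ///
        label(data(southasia.dta) xcoord(x_c) ycoord(y_c) label(admin) size(*0.8))
    ```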

  32. Thanks a lot!!!!
