Goals and Objectives:
The goal of this lab was to geocode a set of sand mine data provided by the DNR. The data was not formatted consistently, so each record had to be cleaned up by hand, which was tedious. Working with it showed how important normalization is when providing data, and how much the accuracy of that data matters.
Methods:
Before anything could be done in ArcMap, I had to normalize the data that was given to me (Figure 1). I added new columns so that the different parts of each address could be separated into their own fields and work with the geocoder. After doing this, I was able to use the geocoder successfully for a few of the addresses. Unfortunately, most of my mines were listed only in Public Land Survey System (PLSS) notation, which the geocoder could not handle well, so I had to locate those mines manually using the description given in the address field of the Excel file.
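The normalization step can be sketched in Python. This is only an illustration, not the workflow used in the lab (that was done by hand in Excel): the input layout "street, city, state zip", the field names, and both function names are assumptions made up for the example, and the PLSS check only looks for a township/range pattern such as "T25N R9W".

```python
import re

# Hypothetical pattern for PLSS township/range notation, e.g. "T25N R9W".
PLSS_PATTERN = re.compile(r"\bT\s*\d+\s*[NS]\b.*\bR\s*\d+\s*[EW]\b", re.IGNORECASE)


def is_plss(raw):
    """Return True if the text looks like a PLSS description, not a street address."""
    return PLSS_PATTERN.search(raw) is not None


def normalize_address(raw):
    """Split an assumed 'street, city, state zip' string into separate fields."""
    # Collapse repeated whitespace and trim stray commas or spaces at the ends.
    cleaned = re.sub(r"\s+", " ", raw).strip(" ,")
    parts = [p.strip() for p in cleaned.split(",")]
    record = {"street": "", "city": "", "state": "", "zip": ""}
    if len(parts) >= 3:
        record["street"] = parts[0]
        record["city"] = parts[1]
        # The last part is expected to be a two-letter state and an optional ZIP.
        match = re.match(r"([A-Za-z]{2})\s*(\d{5})?$", parts[2])
        if match:
            record["state"] = match.group(1).upper()
            record["zip"] = match.group(2) or ""
    return record
```

Records for which `is_plss` returns True would be set aside for manual placement, while the rest could be passed through `normalize_address` before being loaded into the geocoder.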
Once all of the addresses were normalized (Figure 2), I was able to load the table into ArcMap and use the geocoding tool. I set up the tool by choosing which columns in my table corresponded to which geocoder fields; for example, my address column was mapped to the geocoder's address field. Esri's geocoder then matched the addresses using the "World Geocoding Service." The matched locations were not exact, so I had to check each point and make sure it was in the correct spot on the sand mine. Among the mines I was specifically assigned to geocode, a few did not appear on the Esri base map I was using. For those, I found the mine on Google and then located it on the base map using the surrounding roads. This made the process slightly more difficult, but the PLSS description helped to narrow the location down further. I used the address inspector to unmatch the points and chose the correct location with the "Pick Address From Map" tool.
Finally, I was able to compare my colleagues' results, my own results, and the actual locations of the mines. To do this I used the "Merge" tool, which we recently learned in a demo, to combine the datasets. Next, I used the "Near" tool to calculate the distance from each of my points to the closest of my colleagues' points.
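The idea behind the nearest-point comparison can be sketched outside ArcMap. The example below is not the "Near" tool itself: it assumes each point is a (latitude, longitude) pair in decimal degrees and swaps in the haversine great-circle formula on a spherical Earth, so its distances will differ slightly from those of a projected or geodesic calculation. The function names are made up for the illustration.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_distances(my_points, other_points):
    """For each of my points, return the distance to the closest other point."""
    return [
        min(haversine_m(lat, lon, olat, olon) for olat, olon in other_points)
        for lat, lon in my_points
    ]
```

A small distance for a point means a classmate placed the same mine in nearly the same spot; a large one flags a point worth checking against the actual mine location.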
Results:
When I first ran the geocoder, some of the addresses did not show up at all, some were placed in the middle of a town, and some were not located anywhere near a street. This was not the result I was hoping for, but I expected it because very few of my mines had actual addresses, and I needed to use the PLSS descriptions to find the rest.
Figure 3: A map showing the merged geocoding results and the actual mine locations.
I then used the Distance to Point tool, which allowed me to find the distance between my mine locations and my classmates' (Figure 4). This was interesting because I expected the distances to be much larger than they actually were; the points turned out to be much closer together.
Figure 4: The distance results between my mine locations and my classmates'.
Discussion:
There are many reasons why the distances between the geocoded points differ. Lo and Yeung (2003) list them in a table in one of our class readings. I do not believe there are any gross errors in this data, because every point is assumed to be a sand mine location. However, there may be systematic errors, because not all of the mines were shown on the base map, which made it harder to place them accurately. There may also be some random errors, which could come from simply making a mistake or from accidentally moving a point that was not supposed to be moved. This lab was a great lesson in how the different kinds of error can change the outcome of the data, and in how important accurate data is when real-life decisions are based on data that I have made or used.
Inherent errors are a large source of the error in the geocoded points. These are errors introduced when the data is digitized. The fact that each of my classmates created their dataset differently makes the combined data inconsistent. If only one person prepares the data, only that person has to keep the format consistent, but with a whole class the format is constantly changing. Operational errors were also a large part of this lab, because many students may have used the data incorrectly or made a mistake while working with it.
To figure out which points are correct and which are not, I would want a full list of addresses for each mine. Without that information, we could go over the data as a group to decide which points are correct. Having latitude and longitude data is also very helpful, since it makes it easier to find the location on the map.
Conclusion:
Being able to normalize data, understand the data, and use it accurately is extremely beneficial when geocoding. Having a set of rules for formatting the data would also help: it would make the data easier to use, and following the guidelines would reduce the chance of error.
References Cited:
Lo, C., & Yeung, A. (2003). Data Quality and Data Standards. In Concepts and Techniques in Geographic Information Systems (pp. 104-132). Pearson Prentice Hall.