This repository has been archived by the owner on May 5, 2022. It is now read-only.

ISSUE-766 | Added new attribute method #767

Merged: 1 commit merged into openaddresses:master on Apr 19, 2020

Conversation

Contributor

@macieg macieg commented Apr 18, 2020

No description provided.

@iandees
Member

iandees commented Apr 18, 2020

Have you tested this with real data? I'm surprised that we'd get multiple values for a single column back like that.

@macieg
Contributor Author

macieg commented Apr 18, 2020

@iandees
It's not about columns in CSV files. It's about nodes in XML files, as described in #766.

@macieg
Contributor Author

macieg commented Apr 19, 2020

@iandees to be more specific:
Some time ago I updated the cache of Polish addresses because it was outdated.
Now I'd like to get rid of that cache and use a frequently updated data source.

That requires adding this additional attribute method. Apart from this PR, I'll also need to make changes in the main repository and in the documentation.

Maybe I'm wrong, but is this the place where I should start?

I'm happy to give a more detailed explanation if needed :)

@iandees
Member

iandees commented Apr 19, 2020

I understand what the source data looks like, but I'm pretty sure that as that data works its way through our pipeline it will lose multiple values and end up with a single string, not an array of strings. This is why I had asked if you tried this change outside of the unit test.

@macieg
Contributor Author

macieg commented Apr 19, 2020

@iandees - I've tried.

I was running some experiments with another task:

openaddresses/openaddresses@b14251c

If we take a look at the resulting file, we'll see:

LON,LAT,NUMBER,STREET,UNIT,CITY,DISTRICT,REGION,POSTCODE,ID,HASH
15.9878829,54.0127764,30,Kochanowskiego,,Białogard,"['Polska', 'zachodniopomorskie', 'białogardzki', 'Białogard']","['Polska', 'zachodniopomorskie', 'białogardzki', 'Białogard']",78-200,PL.ZIPIN.1422.EMUiA_05e9f97a-c860-43ff-b7f1-e6fcd229a7c3,48b297bfd8452a34
15.9832874,54.008211,9,Ludowa,,Białogard,"['Polska', 'zachodniopomorskie', 'białogardzki', 'Białogard']","['Polska', 'zachodniopomorskie', 'białogardzki', 'Białogard']",78-200,PL.ZIPIN.1422.EMUiA_05f08ec1-6eaf-484a-aca7-6d0f69c234de,0fe4030acab1328a
15.9705208,54.0123932,14,Królowej Jadwigi,,Białogard,"['Polska', 'zachodniopomorskie', 'białogardzki', 'Białogard']","['Polska', 'zachodniopomorskie', 'białogardzki', 'Białogard']",78-200,PL.ZIPIN.1422.EMUiA_05f68238-99d8-4990-a21d-41cd2cabbf85,bf6b1980190d2eaa
16.0040907,54.012632,2,Gryfitów,,Białogard,"['Polska', 'zachodniopomorskie', 'białogardzki', 'Białogard']","['Polska', 'zachodniopomorskie', 'białogardzki', 'Białogard']",78-200,PL.ZIPIN.1422.EMUiA_060b6d7e-c2a1-401e-8eb2-eada4e0483c7,9aeb025523996d9

"['Polska', 'zachodniopomorskie', 'białogardzki', 'Białogard']"

There are quotes around it, but it looks to me like it was treated as an array before being written to the file.

Am I wrong?

I haven't tried it on my local computer. I can do that if needed. :)
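
The quoted value in the output cannot, on its own, show whether the field was a list or a string before it was written. A minimal sketch using Python's standard `csv` module, assuming default writer settings (the `write_row` helper is hypothetical and not part of the OpenAddresses code): a real list and a string that merely looks like a list produce the same CSV text.

```python
import csv
import io


def write_row(value):
    """Write a one-field CSV row and return the resulting text (hypothetical helper)."""
    buffer = io.StringIO()
    # csv.writer converts non-string values with str() before writing.
    csv.writer(buffer).writerow([value])
    return buffer.getvalue().strip()


# A real Python list of region names.
as_list = ['Polska', 'zachodniopomorskie', 'białogardzki', 'Białogard']
# A plain string that only looks like a list.
as_string = "['Polska', 'zachodniopomorskie', 'białogardzki', 'Białogard']"

row_from_list = write_row(as_list)
row_from_string = write_row(as_string)

print(row_from_list)
print(row_from_list == row_from_string)
```

Both rows come out quoted because the text contains commas, so checking the in-memory type before the write step is the only reliable test.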

@iandees
Member

iandees commented Apr 19, 2020

I think that might be the text coming out of OGR. But let's try it and see what happens!

@iandees iandees merged commit 714c830 into openaddresses:master Apr 19, 2020
@macieg
Contributor Author

macieg commented Apr 25, 2020

@iandees - you were right, it was just a string :/
Fix below, not the prettiest code :)

#768
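
For illustration only, here is one way a string that holds the text of a list could be turned back into separate values. This is a sketch under assumptions, not the code from #768: `parse_list_like` is a hypothetical name, and it assumes the string is formatted like a Python list literal, as in the output shown earlier in this thread.

```python
import ast


def parse_list_like(value):
    """Return the items if value is the text of a list literal, else [value].

    Hypothetical helper for illustration; not the fix merged in #768.
    """
    try:
        # literal_eval only accepts Python literals, so it never runs arbitrary code.
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        # Ordinary text such as a city name is not a literal: keep it as one value.
        return [value]
    if isinstance(parsed, (list, tuple)):
        return list(parsed)
    # Anything else that happens to parse (e.g. a number) stays a single value.
    return [value]


regions = parse_list_like("['Polska', 'zachodniopomorskie', 'białogardzki', 'Białogard']")
print(regions)
print(parse_list_like("Białogard"))
```

The fallback to a single-item list keeps ordinary single-valued fields unchanged, so the same helper could be applied to every field regardless of whether the source node had one value or several.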


3 participants