Data source: FOI requests (Woo-besluiten) #145

Open
4 tasks
vanderburgt opened this issue Oct 31, 2023 · 3 comments

@vanderburgt
Collaborator

User Story

As a Bron user,
I want to access and search through completed Freedom of Information (FOI) requests and their attached documents published by governments (in Dutch: "Woo-besluiten"),
so that I can access valuable information and insights obtained through FOI requests and enhance my research and understanding of government activities.

Acceptance Criteria

  1. The "Woo-besluiten" data source should be integrated into the product's search functionality, alongside existing data sources.
  2. Users should be able to search for FOI requests and related documents within the "Woo-besluiten" data source.
  3. Search results from the "Woo-besluiten" data source should be presented in a clear and user-friendly manner.
  4. Users should be able to view detailed information about each FOI request, including request details, government entities involved, and related documents.
  5. The "Woo-besluiten" data source should be available as a filter under the "more options" (in Dutch: "meer opties") section.
  6. The data source should be regularly updated (daily?) to ensure the latest FOI requests and documents are available.
  7. Search results should provide a link to the original FOI source.

Additional Information

  • Possible data sources: Woogle, Open.overheid.nl, Woo-index.

Tasks

  • Research different data sources and their structures to determine their suitability for Bron.
  • Research methods for implementing the data sources into Bron.
  • Test data quality and search performance.
  • Design and implement the user interface for searching and viewing FOI requests from "Woo-besluiten."
@vanderburgt vanderburgt added the user story Describes a new feature or requirement label Oct 31, 2023
@vanderburgt
Collaborator Author

vanderburgt commented Nov 20, 2023

  • Research Woogle dataset availability (download links and update frequency)
  • Run partial import to test data

@breyten
Member

breyten commented Jan 16, 2024

Woogle:

  • Update frequency: daily (?). Each municipality has a document count, so it is easy to check whether something has been updated, and we can crawl the index more often if necessary.
  • Type of data: JSON/PDF
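
A minimal sketch of the document-count check mentioned above, assuming the counts from the previous crawl are stored somewhere as a mapping from municipality code to number of documents. The function name and the sample values are hypothetical; how the counts are actually read from Woogle is not specified here.

```python
from typing import Dict, List


def municipalities_to_recrawl(
    previous: Dict[str, int], current: Dict[str, int]
) -> List[str]:
    """Return the codes whose document count is new or differs from the last crawl.

    `previous` and `current` map a municipality code (e.g. "nl.gm0518")
    to its document count. Both mappings are assumed inputs.
    """
    return sorted(
        code
        for code, count in current.items()
        if previous.get(code) != count  # unseen code or changed count
    )


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    last_crawl = {"nl.gm0518": 10}
    this_crawl = {"nl.gm0518": 12}
    print(municipalities_to_recrawl(last_crawl, this_crawl))
```

Only the municipalities returned by this check would need a full recrawl, which keeps a more frequent index crawl cheap.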

Method of crawling:

  • First crawl https://woogle.wooverheid.nl/overview (the JSON version of this page results in a timeout).
  • Then extract the nl.gm(\d+) codes and crawl the page for each code as follows: https://doi.wooverheid.nl/?doi=nl.gm0518&infobox=true
    • The foi_files field points to the resources, which can be pretty much anything. The most common types are PDF and HTML.
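
The two crawl steps above can be sketched roughly as follows. This is an untested outline, not the Bron implementation: it assumes the nl.gm codes appear literally in the HTML of the overview page, and the function names are hypothetical. Fetching is left out on purpose, and guessing the resource type from the URL's file extension is only a heuristic, since the resources can be almost anything.

```python
import mimetypes
import re
from typing import List
from urllib.parse import urlencode, urlparse

DOI_BASE = "https://doi.wooverheid.nl/"  # base URL from the example above
CODE_RE = re.compile(r"nl\.gm\d+")  # municipality codes such as nl.gm0518


def extract_codes(overview_html: str) -> List[str]:
    """Return the unique nl.gm codes in order of first appearance."""
    return list(dict.fromkeys(CODE_RE.findall(overview_html)))


def infobox_url(code: str) -> str:
    """Build the per-municipality URL to crawl for a given code."""
    return DOI_BASE + "?" + urlencode({"doi": code, "infobox": "true"})


def resource_kind(url: str) -> str:
    """Guess a MIME type from the URL path, or "unknown" if there is no hint."""
    mime, _encoding = mimetypes.guess_type(urlparse(url).path)
    return mime or "unknown"


if __name__ == "__main__":
    # Stand-in for the downloaded overview page.
    sample = '<a href="?doi=nl.gm0518">nl.gm0518</a>'
    for code in extract_codes(sample):
        print(infobox_url(code))
```

A real crawler would probably also check the Content-Type header of each resource, because the file extension alone is not reliable.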

@breyten
Member

breyten commented Jan 16, 2024

Open.overheid.nl

  • Only ministries, not municipalities

Method of crawling

Projects
Status: Testing 🐳