This library contains two functions to get rechtspraak data and metadata from the API.
Python 3.9+
| Pranav Bapat | running-machin | Piotr Lewandowski | shashankmc | gijsvd |
pip install rechtspraak_extractor
get_rechtspraak
Gets all the ECLIs and saves them in a CSV file or returns them in memory.
It retrieves the ECLI, title, summary, updated date, and link.
get_rechtspraak_metadata
Gets the metadata of the ECLIs produced by get_rechtspraak and saves it in a new CSV file or returns it in memory.
The link attribute returned by get_rechtspraak contains the link to each ECLI's metadata.
It retrieves instantie, datum uitspraak, datum publicatie, zaaknummer, rechtsgebieden, bijzondere kenmerken, inhoudsindicatie, and vindplaatsen.
Supports two extraction methods: method='api' (default, fetches live from the Rechtspraak API) and method='sqlite' (fetches from a local pre-built SQLite database; see the SQLite method below).
fetch_eclis_via_sqlite
Low-level function to look up a list of ECLIs directly from a local SQLite database and return a DataFrame.
Requires the rechtspraak-lido-sqlite package to be installed and its database populated first (see the SQLite method below).
- get_rechtspraak(max_ecli=100, sd='2022-05-01', ed='2022-10-01', save_file='y') Parameters:
  - max_ecli: int, optional. Maximum number of ECLIs to retrieve. Default: 100
  - sd: date, optional, default '2022-08-01'. The start publication date (yyyy-mm-dd)
  - ed: date, optional, default current date. The end publication date (yyyy-mm-dd)
  - save_file: ['y', 'n'], default 'y'
    - y - Save data as a CSV file in data folder
    - n - Save data as a dataframe in-memory
- get_rechtspraak_metadata(...)
  - save_file: ['y', 'n'], default 'n'
    - y - Save data as a CSV file in data folder
    - n - Return data as a dataframe in-memory
  - dataframe: dataframe, optional. Dataframe containing ECLIs to retrieve metadata. Cannot be combined with filename
  - filename: string, optional. CSV file containing ECLIs to retrieve metadata. Cannot be combined with dataframe
  - method: ['api', 'sqlite'], default 'api'
    - api - Fetch metadata live from the Rechtspraak API
    - sqlite - Fetch metadata from a local SQLite database (requires rechtspraak-lido-sqlite)
  - sqlite_db_path: string, default 'data/lido_metadata.db'. Path to the SQLite database file. Only used when method='sqlite'
  - fallback_to_api: bool, default True. When using method='sqlite', fall back to the live API for any ECLIs not found in the database
  - multi_threading: bool, default True. Use multi-threading for API-based metadata extraction. Set to False for single-threaded execution
- fetch_eclis_via_sqlite(ecli_list, sqlite_db_path, columns)
  - ecli_list: list[str]. List of ECLI identifiers to look up
  - sqlite_db_path: string. Path to the SQLite database file produced by rechtspraak-lido-sqlite
  - columns: list[str]. Column names to select from the database
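The sd and ed parameters are documented as yyyy-mm-dd dates. As a minimal sketch, a caller might check a date range before passing it to get_rechtspraak. The helper name check_date_range is hypothetical and not part of rechtspraak_extractor; it only illustrates the expected format.

```python
from datetime import date
from typing import Optional, Tuple


def check_date_range(sd: str, ed: Optional[str] = None) -> Tuple[str, str]:
    """Hypothetical helper: validate yyyy-mm-dd strings and their order."""
    start = date.fromisoformat(sd)  # raises ValueError on malformed input
    # Mirror the documented default: a missing end date means "today".
    end = date.fromisoformat(ed) if ed is not None else date.today()
    if start > end:
        raise ValueError(f"start date {sd} is after end date {end.isoformat()}")
    return start.isoformat(), end.isoformat()


sd, ed = check_date_range("2022-05-01", "2022-10-01")
```

The returned strings can then be passed as the sd and ed arguments.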
import rechtspraak_extractor as rex
# Get rechtspraak data as a DataFrame (100 ECLIs since 2022-08-01)
df = rex.get_rechtspraak(max_ecli=100, sd="2022-08-01", save_file="n")
# Save rechtspraak data directly to CSV in the data/ folder
rex.get_rechtspraak(max_ecli=100, sd="2022-08-01", save_file="y")
# Get metadata into a DataFrame from an existing DataFrame
df_metadata = rex.get_rechtspraak_metadata(save_file="n", dataframe=df)
# Get metadata into a DataFrame from a CSV produced by get_rechtspraak
df_metadata = rex.get_rechtspraak_metadata(save_file="n", filename="rechtspraak.csv")
# Produce metadata CSV from an in-memory DataFrame
rex.get_rechtspraak_metadata(save_file="y", dataframe=df)
# Produce metadata CSV from files already in data/ (processes all files)
rex.get_rechtspraak_metadata(save_file="y")
filename refers to a file in the data/ folder created by get_rechtspraak. df is the DataFrame returned by get_rechtspraak.
The SQLite method fetches metadata from a local pre-built database instead of making live API calls. This is significantly faster for large batches and works offline.
Prerequisite: The rechtspraak-lido-sqlite package must be installed and its database must be built locally before using this method.
pip install rechtspraak-lido-sqlite
After installing, follow the rechtspraak-lido-sqlite instructions to build the local database (typically produces a file at data/lido.db or a path you configure).
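Because the SQLite method depends on a database file that has to be built first, a script may want to decide between the two documented method values at run time. The following is a minimal sketch under that assumption; choose_method is a hypothetical helper, not a function of rechtspraak_extractor.

```python
from pathlib import Path


def choose_method(sqlite_db_path: str) -> str:
    """Hypothetical helper: pick 'sqlite' only if the database file exists."""
    return "sqlite" if Path(sqlite_db_path).is_file() else "api"


# The path is the example location mentioned above; adjust it to your setup.
method = choose_method("data/lido.db")
```

The resulting string can be passed as the method argument of get_rechtspraak_metadata.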
import rechtspraak_extractor as rex
df = rex.get_rechtspraak(max_ecli=500, sd="2025-01-01", save_file="n")
# Fetch metadata from local SQLite database
df_metadata = rex.get_rechtspraak_metadata(
    save_file="n",
    dataframe=df,
    method="sqlite",
    sqlite_db_path="data/lido.db",  # path to the database built by rechtspraak-lido-sqlite
    fallback_to_api=True,  # fall back to live API for ECLIs not found in the DB
)
from rechtspraak_extractor.rechtspraak_metadata import fetch_eclis_via_sqlite
eclis = ["ECLI:NL:HR:2023:1", "ECLI:NL:HR:2023:2"]
columns = ["ecli", "document_type", "date_decision", "instance", "full_text"]
df = fetch_eclis_via_sqlite(
    ecli_list=eclis,
    sqlite_db_path="data/lido.db",
    columns=columns,
)
Note: If the database file does not exist or an ECLI is not found in it, fetch_eclis_via_sqlite returns an empty DataFrame rather than raising an error. Use fallback_to_api=True in get_rechtspraak_metadata to automatically cover missing ECLIs via the live API.
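The lookup-then-fallback idea can be illustrated with the standard library alone. This is only a sketch of the general pattern, not the library's implementation: the table name cases, the function lookup_with_fallback, and the demo rows are all made up for illustration, and the fallback callable stands in for a live API request.

```python
import sqlite3
from typing import Callable, Dict, List


def lookup_with_fallback(
    conn: sqlite3.Connection,
    eclis: List[str],
    fallback: Callable[[str], str],
) -> Dict[str, str]:
    """Return {ecli: instance}; ECLIs missing from the DB go through `fallback`."""
    found: Dict[str, str] = {}
    if eclis:
        # One "?" placeholder per ECLI keeps the query parameterized.
        placeholders = ",".join("?" for _ in eclis)
        rows = conn.execute(
            f"SELECT ecli, instance FROM cases WHERE ecli IN ({placeholders})",
            eclis,
        ).fetchall()
        found = dict(rows)
    for ecli in eclis:
        if ecli not in found:
            found[ecli] = fallback(ecli)  # stand-in for a live API call
    return found


# Demo on an in-memory database with a made-up row (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (ecli TEXT PRIMARY KEY, instance TEXT)")
conn.execute("INSERT INTO cases VALUES (?, ?)", ("ECLI:NL:HR:2023:1", "Hoge Raad"))
result = lookup_with_fallback(
    conn,
    ["ECLI:NL:HR:2023:1", "ECLI:NL:HR:2023:2"],
    fallback=lambda ecli: "from-api-placeholder",
)
```

Here the first ECLI is served from the database and the second goes through the fallback, which is the division of work that fallback_to_api=True is described as providing.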
Previously under the MIT License, as of 28/10/2022 this work is licensed under the Apache License, Version 2.0.
Apache License, Version 2.0
Copyright (c) 2022 Maastricht Law & Tech Lab
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.