Aleph is a tool for indexing large amounts of both documents (PDF, Word, HTML) and structured (CSV, XLS, SQL) data for easy browsing and search. It is built with investigative reporting as a primary use case. Aleph allows cross-referencing mentions of well-known entities (such as people and companies) against watchlists, e.g. from prior research or public datasets.
Here are some key features:
Web-based search across large document and data sets.
Imports many file formats, including popular office formats, spreadsheets, email and zipped archives. Processing includes optical character recognition, language and encoding detection and named entity extraction.
Load structured entity graph data from databases and CSV files. This allows navigation of complex datasets like company registries, sanctions lists or procurement data. Import tools for OpenSanctions are included.
Receive notifications for new search matches with a personal watchlist.
OAuth authorization and access control on a per-source and per-watchlist basis.
Changelog v3.12.7 RC1
Bump urllib3 from 1.26.10 to 1.26.11 by @dependabot in #2417
Bump followthemoney from 3.0.2 to 3.0.3 by @dependabot in #2418
removed code smell of function args with mutable defaults by @brassy-endomorph in #2436
Bump normality from 2.3.3 to 2.4.0 by @dependabot in #2430
Switch to rabbitmq based task queue by @sunu in #2199
Remove the public access disabled message by @sunu in #2408
Fix timelines editor by @tillprochaska in #2457
Bump sqlalchemy from 1.4.39 to 1.4.40 by @dependabot in #2455
Make the hide filter button persistent by @Rosencrantz in #2459
Bump jsonschema from 4.7.2 to 4.9.1 by @dependabot in #2443
Quickfix: Set the max_analyzed_offset setting to prevent ES errors by @sunu in #2474
Display highlights in document plaintext view mode by @tillprochaska in #2388
Bump react-dropzone from 11.7.1 to 14.2.2 in /ui by @dependabot in #2375
Bump axios from 0.25.0 to 0.27.2 in /ui by @dependabot in #2250
Bump yaml from 1.10.2 to 2.1.1 in /ui by @dependabot in #2291
Bump urllib3 from 1.26.11 to 1.26.12 by @dependabot in #2479
Fix activation screen by @tillprochaska in #2506
Fix date helper by @tillprochaska in #2490
Till/2400 message banner by @tillprochaska in #2421
Bump uuid from 8.3.2 to 9.0.0 in /ui by @dependabot in #2510
Bump authlib from 0.15.5 to 1.1.0 by @dependabot in #2518
Bump jsonschema from 4.9.1 to 4.16.0 by @dependabot in #2508
Bump sqlalchemy from 1.4.40 to 1.4.41 by @dependabot in #2503
Bump black from 22.6.0 to 22.8.0 by @dependabot in #2493
Update pyjwt requirement from <2.5.0,>=2.0.1 to >=2.0.1,<2.6.0 by @dependabot in #2522
Rollback RabbitMQ based task queue changes by @sunu in #2536
change contact to new form by @jlstro in #2541
Bump servicelayer[amazon,google] from 1.20.0 to 1.20.4 by @dependabot in #2537
Bump ftm and ftm-compare by @Rosencrantz in #2544
added ability to set custom settings via env vars by @brassy-endomorph in #2538
Bump servicelayer[amazon,google] from 1.20.4 to 1.20.5 by @dependabot in #2545
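
For readers unfamiliar with the "mutable defaults" code smell mentioned in #2436, here is a minimal Python sketch of the general pitfall. The function names are hypothetical and are not taken from Aleph's code; the example only illustrates why a default such as `[]` is usually replaced with `None`.

```python
def add_tag_buggy(tag, tags=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `tags` appends to the same shared object.
    tags.append(tag)
    return tags


def add_tag_fixed(tag, tags=None):
    # Use None as a sentinel and build a fresh list on each call.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags
```

With the buggy version, state leaks between calls that rely on the default; the fixed version starts from an empty list every time.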