Heurist


Heurist is an Open Source online database builder and CMS publisher designed for Humanities research data and collections, including data on people, organisations, places, events, artefacts, documents, media, bibliographic records, contemporary stories and other data which is rich in text and classification data, richly interlinked, and often heterogeneous.
Heurist was originally designed by Ian Johnson and developed by the Arts eResearch unit at the University of Sydney. It continues to be actively developed within the Faculty of Arts and Social Sciences. Free web services for building research databases are available at https://heuristplus.sydney.edu.au/ and . New Heurist servers can be set up using installation packages downloadable from the project web site.
Heurist was developed to overcome three problems identified as common to researchers in the Humanities:
  • the technical expertise required to set up rich heterogeneous databases with relationships between entities, and to publish data selectively to the web
  • the fragmentation of research data across many separate poorly-connected or incompatible databases
  • problems of sustainability due to the ad hoc nature of custom database development requiring individual maintenance of each database
It aims to tackle these issues by:
  • providing a web service supporting the on-demand creation, management and population of new databases through a web interface, and the creation of CMS web sites embedded directly in the databases which have direct access to the database content.
  • allowing the storage and interlinking of a wide variety of research data, notes, annotations and digital attachments in a single shared database, while providing individual ‘views’ on this data and workgroup-owned and private areas for research in progress.
  • centralising the update and maintenance of thousands of databases, and automatically updating database formats with newer software versions to ensure backward compatibility. Data can also be dumped in a reloadable archival format.

Methodology

Heurist is written in PHP and JavaScript, on top of a fixed MySQL/MariaDB data structure. Entities/record types, fields, vocabularies and terms are defined through data within the database rather than being hardcoded in the software or database structure. Heurist uses a key-value pair approach linked to a primary data table instantiating typed entities, allowing variant data structures and repeating value fields with maintained order. Relationships between entities are implemented as record pointer fields and Relationship Marker fields.
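The key-value approach described above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than MySQL/MariaDB, and every table name, column name and sample value is hypothetical; it is not Heurist's actual schema, only an illustration of how record types and fields can be defined as data and how repeated values can keep their order.

```python
import sqlite3

# Hypothetical table and column names; Heurist's real schema is not shown here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE record_types (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE field_types  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE records      (id INTEGER PRIMARY KEY,
                           type_id INTEGER REFERENCES record_types(id));
CREATE TABLE details (
    record_id INTEGER REFERENCES records(id),
    field_id  INTEGER REFERENCES field_types(id),
    position  INTEGER,   -- keeps repeated values in a maintained order
    value     TEXT
);
""")

# Entity and field definitions are rows of data, not columns in the schema.
conn.execute("INSERT INTO record_types VALUES (1, 'Person')")
conn.executemany("INSERT INTO field_types VALUES (?, ?)",
                 [(1, "Name"), (2, "Occupation")])

# One typed record with a single-value field and a repeating field (sample data).
conn.execute("INSERT INTO records VALUES (100, 1)")
conn.executemany(
    "INSERT INTO details VALUES (?, ?, ?, ?)",
    [(100, 1, 0, "Sample Person"),
     (100, 2, 0, "Mathematician"),
     (100, 2, 1, "Writer")],
)

def load_record(conn, record_id):
    """Assemble one record as {field name: [values in stored order]}."""
    rows = conn.execute(
        """SELECT f.name, d.value
           FROM details d JOIN field_types f ON f.id = d.field_id
           WHERE d.record_id = ?
           ORDER BY d.field_id, d.position""",
        (record_id,),
    )
    record = {}
    for name, value in rows:
        record.setdefault(name, []).append(value)
    return record

person = load_record(conn, 100)
```

Adding a new field or record type in this sketch means inserting a row rather than altering a table, which is the property that lets one fixed structure hold variant data.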
Heurist has the following field types, all of which can have multiple cardinality:
  • Numeric
  • Text
  • Term lists
  • Date / time fields
  • Geographic
  • Pointer fields allowing lookup of another record in the database
  • Relationship marker fields allowing the creation of typed, constrained, directional, dated and annotated relationships between records
  • File fields - uploaded to server or remote files referenced through a URL
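As a rough illustration of what "typed, constrained, directional, dated and annotated" can mean for a relationship, the following sketch models one in Python. The class, the constraint table and the relationship type names are all assumptions made for this example and do not reflect Heurist's internal representation.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative records: id -> record type (sample data, not from a real database).
RECORD_TYPES = {1: "Person", 2: "Organisation", 3: "Place"}

# Hypothetical constraints: relationship type -> (source type, target type).
CONSTRAINTS = {
    "IsMemberOf": ("Person", "Organisation"),
    "IsLocatedIn": ("Organisation", "Place"),
}

@dataclass(frozen=True)
class Relationship:
    source_id: int                 # directional: runs from source to target
    target_id: int
    rel_type: str                  # typed
    start: Optional[str] = None    # dated (plain strings in this sketch)
    end: Optional[str] = None
    note: str = ""                 # annotated

def is_allowed(rel: Relationship) -> bool:
    """Check a relationship against the record-type constraints."""
    expected = CONSTRAINTS.get(rel.rel_type)
    actual = (RECORD_TYPES.get(rel.source_id), RECORD_TYPES.get(rel.target_id))
    return expected is not None and expected == actual

membership = Relationship(1, 2, "IsMemberOf", start="1914", end="1918", note="sample")
reversed_link = Relationship(2, 1, "IsMemberOf")
```

Here `is_allowed(membership)` holds because a Person may be a member of an Organisation, while `is_allowed(reversed_link)` does not, since the same relationship type is not permitted in the opposite direction.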
Heurist provides several modes of data visualisation and export based on filtered subsets of the database: export in CSV, JSON, XML, KML, GeoJSON, GEXF for Gephi, and IIIF manifests; tabular listing; user-defined reporting using Smarty; interactive maps and timelines; simple network diagrams; and crosstabulation. Widgets for these visualisations can be embedded in the CMS website generated from the database, or in standalone web pages or iframes in an external website.
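To make the idea of exporting a filtered subset concrete, here is a small sketch that turns a list of records into a GeoJSON FeatureCollection. The record layout, the `to_geojson` helper and the sample values are assumptions for illustration; the output follows the general GeoJSON structure (longitude before latitude) but is not claimed to match the exact shape of Heurist's own export.

```python
import json

# Sample records in a hypothetical in-memory layout (not Heurist's format).
records = [
    {"id": 1, "type": "Place", "title": "Sample place A", "lon": 151.19, "lat": -33.89},
    {"id": 2, "type": "Person", "title": "Sample person"},
    {"id": 3, "type": "Place", "title": "Sample place B", "lon": 2.35, "lat": 48.86},
]

def to_geojson(records, predicate):
    """Export records that pass the filter and carry coordinates."""
    features = [
        {
            "type": "Feature",
            # GeoJSON positions are [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [r["lon"], r["lat"]]},
            "properties": {"id": r["id"], "title": r["title"]},
        }
        for r in records
        if predicate(r) and "lon" in r and "lat" in r
    ]
    return {"type": "FeatureCollection", "features": features}

# Filter first, then serialise only the matching subset.
places = to_geojson(records, lambda r: r["type"] == "Place")
text = json.dumps(places, indent=2)
```

The same filter-then-serialise pattern would apply to the other formats; only the serialisation step changes.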
Databases can be populated through form-based data entry, CSV import via a wizard which matches existing records and normalises data by extracting and linking entities based on selected columns, Zotero bibliography synchronisation, KML import, media uploads and indexing.
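The matching and normalisation step of a CSV import can be sketched as follows. This is a simplified illustration under stated assumptions, not the wizard's actual logic: the column names, the exact-match rule on the author name and the `import_rows` helper are all hypothetical.

```python
import csv
import io

# Sample input: the "author" column holds an entity to extract and link.
CSV_TEXT = """title,author
Book one,A. Writer
Book two,B. Scribe
Book three,A. Writer
"""

def import_rows(csv_text, existing_people):
    """Link each row to a Person record, reusing a match where one exists."""
    people = dict(existing_people)                 # name -> record id
    next_id = max(people.values(), default=0) + 1
    books = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["author"].strip()
        if name not in people:                     # no match: create a new entity
            people[name] = next_id
            next_id += 1
        # Store a pointer to the Person record instead of the raw name.
        books.append({"title": row["title"], "author_id": people[name]})
    return books, people

# "A. Writer" already exists with id 7, so it is matched rather than duplicated.
books, people = import_rows(CSV_TEXT, {"A. Writer": 7})
```

After the import, the two rows naming the same author point at one shared Person record, which is the effect of normalising the selected column.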
Other functions include wizards to build simple or facetted searches, personal and shared saved searches, search expansion rules to pull in related records, workgroup ownership of records, group notifications, blogging, a bookmarklet for capturing web references, WYSIWYG formatted text, user and workgroup tags.
For developers there is an API and all the export formats are available as live feeds. XML output can be transformed through XSLT stored in records within the database. Heurist source code is available under GNU GPL from the GitHub repository at https://github.com/HeuristNetwork/heurist and can be installed on any LAMP server, including virtual servers in the Research cloud, Amazon AWS and virtual servers from most ISPs. It has also been successfully installed on Windows servers.

Applicability

Heurist was conceived as a digital knowledgebase for managing heterogeneous data with rich interlinking, in small to medium collections, often rich in media, textual and categorisation data, such as those typically found in the Arts and Humanities, and in personal research spaces. It is not suitable for large, structured, homogeneous, numerical datasets typical of the Sciences.
Heurist allows management of information with spatial and temporal components. Spatial components include the ability to enter georeferenced points, polygons etc. directly into an editor, as well as the ability to upload spatial data such as KML and Shapefiles. Spatial data is displayed on a map view within the database. Temporal components include the ability to enter dates as calendar dates, ranges, fuzzy dates or radiocarbon dates, with confidence levels. Dates are displayed on a timeline generally linked to the map display.
As of the end of 2021, Heurist supports a couple of hundred projects on the public servers, ranging from large ERC, AHRC, ANR and ARC funded projects to many small personal projects such as PhD research, primarily in Humanities disciplines.

Example applications

A more extensive list of examples can be found at http://HeuristNetwork.org/Projects

Recent projects (last 5 years)

tbc

Older projects

These projects remain active.
  • Beyond 1914 and Expert Nation - records of university staff and students involved in WWI. Developed 2013 & 2016. A website for the University of Adelaide runs off the same database.
  • Virtual Museum of Balinese Paintings - research into 20th-century Balinese paintings which links to works scattered across multiple collections in various countries. Developed ~2010.
  • Digital Harlem - search and mapping of events in Harlem from 1915 to 1930. First developed in 2003 and transferred to Heurist ~2013.

Past projects

These projects are complete or no longer active.
  • Heurist was used as the database to manage the cultural heritage information for nomination of the World Heritage Site Bahrain Pearling Trail, which was successfully inscribed on the UNESCO World Heritage List in 2012. Cultural Heritage Managers at the former Ministry of Culture in Bahrain used Heurist to collate, analyse, manage and assist with the vast array of data associated with the nomination. This data included spatial polygons defining the properties to be included in the World Heritage Site, details of the properties, details of people associated with the properties, associated photographs, documents and plans, including architectural plans and legal documents. These items were all cross-referenced with intuitive relationships defining how they were associated with each other. This database was referred to in the Nomination file, accepted by UNESCO in 2012.
  • Federated Archaeological Information Management System - generation of database schemas and interoperability with Android field data collection system. Development of a new version of FAIMS in 2021/2022 has incorporated some of the application building and data management functions originally offered by the Heurist integration, while changes in format will require reprogramming of interoperability.
  • the Dictionary of Sydney - born digital in Heurist from 2006, the public web site was generated directly from the Heurist database until ~2016, when the project was transferred to the State Library of NSW and converted to their internal systems.
  • the Australian Broadcasting Corporation Gallipoli project - events stored in Heurist and generated as XML for input to the visualisation
  • Early Agricultural Remnants and Technical Heritage Programme - database of photographic and video recordings of agricultural practice