FieldHelper 2006 Report

1. Introduction

The FIDAS project was funded by APSR from January to December 2006 to help field researchers implement international standards in data creation and description, and so to support a sustainable workflow for creating Submission Information Packages under the OAIS model. The plan comprised development of a data model for academic field research, a middleware tool, "FieldHelper", and dissemination of the project results to the research community through a workshop and guidelines.

FIDAS addressed the following APSR Program specifications:

  1. Program Milestone: Develop implementation of generic tools to facilitate interoperability and sustainability;
  2. Program Objective: Develop middleware and tools to enable sustainability;
  3. Program Deliverable: Middleware and tools to facilitate sustainability;
  4. Program Strategic Task: Develop and implement appropriate middleware and tools to enable demonstrators.

The FIDAS team was led by Linda Barwick (PARADISEC, University of Sydney) and Ian Johnson (ACL, University of Sydney) with considerable input from Tom Honeyman (PARADISEC), Steven Hayes (ACL) and Kim Jackson (ACL).

2. Rationale and objectives

Fieldworkers typically collect data in a rather ad-hoc way during fieldwork, often leading to patchy and highly variable metadata quality at the time of submission to a digital repository. It can be very difficult or even impossible to reconstruct some of this information at a later date, yet these resources are often unique and unrepeatable records of highly significant events collected at considerable expense of researcher time, effort and resources. From the repository perspective, lack of metadata (including preservation metadata) can have serious implications not only for ingestion into a repository, but also for subsequent archival management and dissemination of archival information. This project aimed to extend the scope of the OAIS model to facilitate sustainable data collection and description of digital objects from the time of creation during fieldwork, and to integrate this workflow with repository ingestion, management, and dissemination requirements.

Objectives formulated at the beginning of the project included:

  1. Development of a data model for description and organisation of field data, leading to a tool “FieldHelper” that will prompt fieldworkers to record relevant preservation, description and structural metadata about their resources and their inter-relationships at the time of data creation in the field, and from this to output METS documents to accompany submission of their digital data to relevant repositories. This will be based on APSR-approved standards (e.g. METS and OAIS), and will interface with the University of Sydney testbed repositories (PARADISEC, Archimage and USyd DSpace), but should also be applicable to other APSR projects and repositories;
  2. Field testing the model and FieldHelper tool with various ARC-funded data-intensive field research projects, including the National Recording Project for Indigenous Performance in Australia, Aboriginal Child Language Acquisition, the Angkor Project, PARADISEC and others (including training of researchers, and gathering and assessing feedback to inform further development of FieldHelper);
  3. Knowledge transfer to the wider research community through a) an international workshop “Sustainable Data from Fieldwork”, b) website publications and c) a final report including a recommended workflow and guidelines for field researchers.

3. FIDAS Activities, 2006

3.1 Development of the data model

In Semester 1, PARADISEC project coordinator Tom Honeyman undertook a series of interviews with researchers undertaking fieldwork in a variety of disciplines, including linguistics, ethnomusicology, archaeology (GIS) and botany, to ascertain the amount and typical formats of digital data collected during fieldwork. In conjunction with Steven Hayes (ACL) he undertook analysis of researcher requirements and a number of relevant middleware tools already available, with a view to identifying the key needs to be addressed in the tool development. Since researchers are already dealing with complex contingencies and variables during fieldwork it is desirable to avoid adding to researcher workload by:

  • matching and supporting existing workflows;
  • automating as far as possible extraction of technical metadata about file formats and resolution;
  • focusing researcher input on describing resources according to their own categories to make the most of their expertise and time;
  • making the user interface appealing and easy to use to encourage regular timely addition of metadata while in the field.

Hayes and Honeyman also reviewed relevant standards and XML schemas including MODS (Metadata Object Description Schema) and METS (Metadata Encoding and Transmission Standard). The data model in diagrammatic form was circulated for comment to various researchers during Semester 1, 2006, and an initial interface concept design was completed in June 2006, with subsequent documentation, including an annotated MODS schema and MODS to FieldHelper stylesheets, released through Semester 2 (see Appendix 1: Timeline and Resources below). See also further discussion under 3.2. and 3.4. below.

3.2. FieldHelper Data Model: METS profile and the structural map

(Tom Honeyman)

For this project we consulted with researchers from a number of disciplines to see what kinds of digital and analogue data they might collect. Most notably we investigated the kinds of files created in linguistics, (ethno)musicology, archaeology, and botany. What is quite clear is that currently fieldworkers, especially in the humanities, collect and collate their data in fairly idiosyncratic ways.

Generically, in a typical field session a fieldworker might collect a variety of audio-visual and textual data, which would form a base of primary research data. Analysis materials range from notes (textual) to complex temporal-spatial mappings such as linguistic transcriptions or GIS paths. Analysis files are created using primary research data. In addition to this, other materials may have been used in the creation of the primary data. We call these stimulus materials.

Roughly speaking, these three classes of files (stimulus, research and analysis) fall on a timeline respectively before, during and after the period of actual fieldwork (although real fieldwork is not so straightforward and contained).

Complementary to this is the notion of sessions and streams. Sessions are the discrete events around which stimulus, research and analysis files cluster. Streams are the groups of files created on single devices, such as a stream of audio recordings.
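The terminology above can be illustrated with a small data structure. The following is a minimal sketch only, not the FieldHelper implementation: the class, field and file names are all hypothetical, and it assumes that every file carries exactly one file class, one session label and one stream (device) label.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: every name here is hypothetical, not taken from FieldHelper.
class SessionStreamSketch {

    // The three classes of file, in rough order along the fieldwork timeline.
    enum FileClass { STIMULUS, RESEARCH, ANALYSIS }

    // One file, tagged with its class, the session it clusters around,
    // and the stream (device) it was created on.
    static final class FieldFile {
        final String name;
        final FileClass fileClass;
        final String session;
        final String stream;

        FieldFile(String name, FileClass fileClass, String session, String stream) {
            this.name = name;
            this.fileClass = fileClass;
            this.session = session;
            this.stream = stream;
        }
    }

    // Groups file names under a key, keeping keys in order of first appearance.
    static Map<String, List<String>> groupBy(List<FieldFile> files,
                                             Function<FieldFile, String> key) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (FieldFile f : files) {
            groups.computeIfAbsent(key.apply(f), k -> new ArrayList<>()).add(f.name);
        }
        return groups;
    }

    // Made-up sample data: two sessions, three streams.
    static List<FieldFile> sample() {
        return Arrays.asList(
            new FieldFile("wordlist.pdf", FileClass.STIMULUS, "session-01", "laptop"),
            new FieldFile("rec-001.wav", FileClass.RESEARCH, "session-01", "audio-recorder"),
            new FieldFile("rec-001.txt", FileClass.ANALYSIS, "session-01", "laptop"),
            new FieldFile("rec-002.wav", FileClass.RESEARCH, "session-02", "audio-recorder"),
            new FieldFile("img-014.jpg", FileClass.RESEARCH, "session-02", "camera"));
    }

    public static void main(String[] args) {
        List<FieldFile> files = sample();
        System.out.println("By session: " + groupBy(files, f -> f.session));
        System.out.println("By stream:  " + groupBy(files, f -> f.stream));
    }
}
```

The same list of files yields two different groupings depending on the key, which is the sense in which sessions and streams are complementary views of one collection.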

For further discussion of this terminology see “Sessions and Streams” and “Field Helper Interface Concept Version 3 DRAFT”.

The other tension in creating a structural map lies in the needs of the archive. For instance, an archive such as PARADISEC imposes a different structure on a collection of fieldwork data from, say, the structure that DSpace can support. Therefore the data model needs to be abstract and flexible enough to output METS packages to match the requirements of a variety of archives.
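As a rough illustration of what emitting a structural map might involve, the sketch below builds a bare METS `structMap` with the standard Java DOM API. It is an assumption-laden fragment, not the FieldHelper METS profile: the element names (`mets`, `structMap`, `div`, `fptr`) and the namespace are those of the METS schema, but the class name, the `TYPE` value "session", the label and the file identifiers are hypothetical, and the `fileSec` that a complete package would need for the `FILEID` references is omitted.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative only: a fragment, not a complete or validated METS package.
class StructMapSketch {

    static final String METS_NS = "http://www.loc.gov/METS/";

    // Builds <mets><structMap><div TYPE=grouping LABEL=label><fptr FILEID=.../>...
    // The grouping value (e.g. "session") is a hypothetical choice that a
    // particular archive's requirements would dictate.
    static Document build(String grouping, String label, String... fileIds) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        Element mets = doc.createElementNS(METS_NS, "mets:mets");
        doc.appendChild(mets);

        Element structMap = doc.createElementNS(METS_NS, "mets:structMap");
        structMap.setAttribute("TYPE", "logical");
        mets.appendChild(structMap);

        Element div = doc.createElementNS(METS_NS, "mets:div");
        div.setAttribute("TYPE", grouping);
        div.setAttribute("LABEL", label);
        structMap.appendChild(div);

        // Each fptr points at a file that a full package would declare in fileSec.
        for (String id : fileIds) {
            Element fptr = doc.createElementNS(METS_NS, "mets:fptr");
            fptr.setAttribute("FILEID", id);
            div.appendChild(fptr);
        }
        return doc;
    }

    // Serialises the DOM tree to an indented XML string.
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(build("session", "session-01", "FILE-001", "FILE-002")));
    }
}
```

Changing the `grouping` argument (for example to a hypothetical "stream") is one simple way the same underlying files could be presented with a different structure for a different archive.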

3.3. Fieldtesting of the FieldHelper model

Various researchers associated with the project (including Linda Barwick and her collaborators) used a simple spreadsheet including the core descriptive elements of the data model to track creation of fieldwork data during a number of field trips in August-September 2006. As a result of this testing various changes were made to the core metadata set.

3.4. Development of the FieldHelper application - The design and production process in 2006

(Steven Hayes)

Steven Hayes and Kim Jackson of the ACL became involved in the FIDAS project in March 2006 and were specifically tasked with designing and creating the FieldHelper application.

In broad terms, the majority of the time allocated by the ACL to creating the application was taken up in conceptual design. The requirements for the application as initially presented suggested a number of solutions. In order to set a clear programming path, therefore, there were a good many discussions involving group-based conceptual design work with whiteboards – predominantly with input from Tom Honeyman. Perhaps the defining breakthrough in the design process came in late May with the emergence of the idea of “drag & tag”, which was first documented in the 7 June “Field Helper Interface Concept Version 3 draft” and circulated amongst APSR stakeholders for comment. At this early stage of conceptual development, Ian Johnson’s detailed interface design suggestions in response to this document were invaluable. Many of the specific suggestions, made by way of a multi-page email and a hand-drawn sketch, were directly integrated into the design, to the extent that the first Alpha release looks surprisingly similar to the pencil sketch.

In late June Steven Hayes and Tom Honeyman travelled to ANU to exchange ideas with the Bidwern development team. This meeting also involved presenting for the first time general architectural concepts for Field Helper along with a mock-up of the user interface. Feedback was positive and many suggestions were integrated into the overall design.

While conceptual design work was progressing, Kim Jackson worked to assess various development platforms and methodologies for Field Helper. After considering alternatives such as development in C++ and Delphi, and after discussion with other experts at the ACL, Java was chosen for its broad support on the Windows and Macintosh platforms and because many associated libraries were available in this programming language - JHOVE probably being the best example here. In order to streamline development and provide a consistent look and feel across both operating systems, SWT was chosen as the basis for the Field Helper GUI and Eclipse was chosen as the project IDE.

In late August the first submission was made to the new Field Helper CVS repository, but coding work did not seriously commence until early September. By that stage there was a clear enough understanding of the required architecture to allow development to proceed very quickly. Coding style has focused on simplicity and reusability of classes, and this has resulted in a fairly small code base which relies heavily on pre-built interface widgets drawn from SWT libraries and on the use of XML and XPath to store and retrieve data. Certain development targets were not met, owing to the vagaries of programming and the tightness of the project timeline; however, a solid and apparently bug-free Alpha release was possible in early December, and it has generated sufficient enthusiasm and support to allow these deficiencies to be made good over late December and early January.
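To give a sense of what storing data as XML and retrieving it with XPath can look like in Java, here is a minimal sketch using the standard `javax.xml.xpath` API. It does not reproduce FieldHelper's actual storage format, which this report does not describe: the `files`/`file`/`tag` layout, the sample file names and the class and method names are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: the XML layout below is an assumption, not FieldHelper's format.
class XPathStoreSketch {

    // A made-up metadata store: each file element carries zero or more tags.
    static final String SAMPLE =
          "<files>"
        + "<file name='song01.wav'><tag>performance</tag><tag>session-01</tag></file>"
        + "<file name='song02.wav'><tag>performance</tag><tag>session-02</tag></file>"
        + "<file name='notes01.txt'><tag>session-01</tag></file>"
        + "</files>";

    // Returns the names of all files carrying the given tag, in document order.
    static List<String> filesWithTag(String xml, String tag) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the tag as an XPath variable so it is never spliced into the expression.
        xpath.setXPathVariableResolver(
            name -> "tag".equals(name.getLocalPart()) ? tag : null);
        NodeList names = (NodeList) xpath.evaluate(
            "/files/file[tag = $tag]/@name",
            new InputSource(new StringReader(xml)),
            XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < names.getLength(); i++) {
            result.add(names.item(i).getNodeValue());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(filesWithTag(SAMPLE, "session-01"));
    }
}
```

Keeping the query in a single XPath expression means the retrieval logic needs no hand-written tree traversal, which fits the small, simple code base described above.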

Field Helper was first demonstrated as a working application to the participants of the annual APSR end-of-year meeting in early November as part of a PowerPoint presentation. The immediate response to the perceived simplicity and usability of the application for the handling of complex metadata was very positive – Colin Webb, Director of Preservation Services at the National Library of Australia, was, for example, moved to make contact with the development team to request that he be involved in early alpha testing with a view to using the application on a key NLA project.

After a further month of development a more polished version of the application was alpha released to a group of 50 participants of the Sustainable Data from Digital Fieldwork conference. A PowerPoint presentation that included a live demo of the application was given first, followed by a hands-on workshop session in which users were taken through downloading and installing the application and following the step-by-step documentation. The success of this release can be gauged by the 20 participants who immediately signed up as alpha testers for the application.

Field Helper was initially conceived as a specialised tool to assist practitioners of specific academic disciplines while gathering digital data in the field. With a greater understanding of the underlying structures being handled and of the capabilities of the interface and its governing technologies, the designers have incorporated the needs of a broader audience, and Field Helper is now emerging as a powerful and easy-to-use generalised metadata enrichment tool.
