Preliminary user evaluation of the Hi-Audio online platform for collaborative audio recording. Results from 22 participants including task-based performance, SUS, and NASA-TLX metrics.
This repository contains the results of a preliminary user evaluation conducted for the Hi-Audio online platform, with a focus on its practical usability and applicability from the perspective of amateur musicians.
The study is directly linked to the following journal publication: Gil Panal, J. M., David, A., & Richard, G. (2026). The Hi-Audio online platform for recording and distributing multi-track music datasets. Journal on Audio, Speech, and Music Processing. https://doi.org/10.1186/s13636-026-00459-0
A preprint version is available at: https://hal.science/hal-05153739
Additional contextual information about the evaluation call and recruitment process is available here: https://hi-audio.imt.fr/agenda/user-evaluation-test/
Individual post-task questionnaire responses (including usability and workload assessments) are available in the surveys/ folder of this repository.
For a related practice-based evaluation of Hi-Audio, see: hiaudio_FLO_practice-based_evaluation
hiaudio_user_evaluation_results/
├── README.md # This file — study summary and results
├── SUPPLEMENTARY.md # Participant-level data tables and qualitative feedback
├── DATA_DICTIONARY.md # Column-level documentation for all data files
├── LICENSE # Creative Commons Attribution 4.0 (CC BY 4.0)
├── validate.py # Script to verify headline statistics from CSV files
├── docs/
│ ├── Hi-Audio_Evaluation Tasks.pdf # Task instructions given to participants
│ └── Hi-Audio_Participant_Comments_Anonymized.md # Per-participant task performance and comments
├── data/ # Raw and processed data for replication
│ ├── Hi-Audio_User_Evaluation_Survey.csv # Anonymized full survey responses
│ ├── Section_4_SUS.csv # SUS raw item scores (22 participants × 10 items, plain text)
│ ├── Section_4_SUS.xlsx # SUS raw items + scoring formula + summary statistics
│ ├── Section_5_NASA-TLX.csv # NASA-TLX raw dimension scores (22 participants × 6 dims, plain text)
│ ├── Section_5_NASA-TLX.xlsx # NASA-TLX raw scores + Raw Score formula + summary statistics
│ └── Task_Performance.csv # Binary task completion outcomes (22 participants × 10 tasks)
├── images/ # Figures and logos used in this README
│ ├── instruments_barchart.png # Reported instruments bar chart
│ ├── fig_task_time_pie.png # Task completion time distribution
│ ├── ERC_logo.png # European Research Council logo
│ └── lascene.png # La Scène logo
└── surveys/ # Individual participant questionnaire responses
└── <response_id>.pdf # One PDF per participant (22 files total)
The PDF files in the surveys/ folder are named after the unique response identifiers assigned by the JotForm survey platform. Each file contains a single participant’s answers to the post-task questionnaire, including their SUS and NASA-TLX responses and task completion self-report.
The data/ folder contains machine-readable files intended for result verification and replication. Each scale is provided in two formats: a plain-text CSV (primary, tool-agnostic) and an Excel workbook (.xlsx) that documents the scoring formulas and includes precomputed summary statistics for transparency. Task completion outcomes are provided separately in Task_Performance.csv (Y/N per participant per task), and the CSV files use the P01–P22 participant labels consistent with the supplementary materials.

Scoring conventions:

- SUS scores were computed using the standard formula (odd items: score − 1; even items: 5 − score; sum × 2.5), with the SD reported as the population SD (dividing by n = 22).
- NASA-TLX raw scores are the unweighted mean of the 6 dimensions, with the Performance dimension inverted (100 − value) before averaging; the SD is reported as the sample SD of the 6 dimension means.

Column-level documentation for all data files is in DATA_DICTIONARY.md.
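For readers recomputing the statistics themselves, the two scoring conventions above can be sketched in Python. The helper names `sus_score` and `nasa_tlx_raw` are illustrative only and are not functions shipped with this repository:

```python
def sus_score(items):
    """Compute a SUS score from the 10 raw item responses (1-5 Likert).

    Odd-numbered items (positively worded): contribution = score - 1.
    Even-numbered items (negatively worded): contribution = 5 - score.
    The summed contributions (0-40) are scaled by 2.5 to a 0-100 range.
    """
    assert len(items) == 10
    total = sum((s - 1) if i % 2 == 0 else (5 - s)  # index 0 is item 1 (odd)
                for i, s in enumerate(items))
    return total * 2.5


def nasa_tlx_raw(dimensions):
    """Unweighted (raw) NASA-TLX score: the mean of the 6 dimension
    ratings (0-100), with Performance inverted (100 - value) so that
    higher values consistently indicate greater workload."""
    inverted = dict(dimensions)
    inverted["Performance"] = 100 - inverted["Performance"]
    return sum(inverted.values()) / len(inverted)
```

For example, a participant answering 5 on every odd item and 1 on every even item would obtain the maximum SUS score of 100, while all-3 responses yield the midpoint score of 50.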
The table below lists the authoritative source file for each headline result. All statistics can be independently recomputed from the CSV files alone.
| Result | Authoritative file |
|---|---|
| SUS scores (per participant + summary) | data/Section_4_SUS.csv |
| NASA-TLX scores (per participant + summary) | data/Section_5_NASA-TLX.csv |
| Task completion rate | data/Task_Performance.csv |
| Participant-to-submission-ID mapping | SUPPLEMENTARY.md — Participant Mapping table |
To verify the headline statistics reported in this README, run from the repository root:
```shell
python validate.py
```
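The script itself is the authoritative reference for the checks it performs. As a rough illustration of one such check, the overall completion rate can be recomputed from the Y/N grid in Task_Performance.csv; the `completion_rate` helper below and the assumed CSV layout (header row, participant-ID first column, one Y/N cell per task) are a sketch, not the script's actual code:

```python
import csv


def completion_rate(path="data/Task_Performance.csv"):
    """Recompute the overall task completion rate from a Y/N grid.

    Assumes a header row, a participant-ID first column, and one
    Y/N cell per task thereafter (layout assumed for illustration).
    """
    done = total = 0
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)                      # skip the header row
        for row in reader:
            for cell in row[1:]:          # skip the participant ID
                total += 1
                done += cell.strip().upper() == "Y"
    return done / total
```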
This evaluation complements the technical validation of the Hi-Audio platform by examining real user interaction under realistic usage conditions. Rather than focusing solely on system performance, it aims to provide empirical insight into how musicians engage with the platform’s core features, including:
The primary objective is to assess usability, discoverability, and cognitive workload, thereby supporting the research and design goals presented in the associated publication.
Evaluation mode:
Age distribution:
Participants were recruited through La Scène (lascene.rezel.net), the student music association of Télécom Paris, which received a recruitment support fee of €40 per participant.
Participants exhibited strong musical profiles, with nearly all reporting prior musical training and regular practice. The most frequently reported instruments were: Piano (n=13), Guitar (n=7), Drums (n=6), Bass (n=5), and Voice (n=4). 13 out of 22 participants (59.1%) had experience with collaborative music recording.

The vast majority had a technical background (20/22, 90.9%), reflecting the recruitment context (students at Télécom Paris). Prior experience with digital audio workstation (DAW) software was evenly split: 11 participants (50%) had previously used a DAW and 11 had not, resulting in a heterogeneous yet representative user population in terms of audio production experience. Regarding audio round-trip latency — a concept directly tested in the evaluation — 13 participants (59.1%) reported familiarity with it, while only 4 (18.2%) had ever measured it in practice.
The evaluation consisted of two exercises, comprising ten task-based activities in total. The original PDF with the instructions can be found at docs/Hi-Audio_Evaluation Tasks.pdf. These tasks were designed to cover the platform’s main functionalities, including:
The evaluation was conducted under realistic and largely uncontrolled conditions: participants used their own devices and audio equipment, and all performed the tasks remotely and without supervision — with the exception of two individuals who completed the test in person. A variety of hardware and software configurations were therefore represented. Although the task instructions recommended the use of the Firefox browser and discouraged Bluetooth audio devices due to potential latency issues, some participants did not follow these guidelines. This variability increased ecological validity by more closely approximating real-world usage scenarios.
Tasks were described in a concise, minimally guided document, specifying what needed to be achieved but not how. This approach was intentionally chosen to:
After completing the tasks, participants answered a post-test questionnaire including:
Detailed scoring procedures and statistical analyses are provided in the supplementary materials of this repository.
Task success was determined by the evaluator through post-hoc review of the recordings and data generated by each participant on the platform, cross-referenced with the participant’s own survey responses. This approach allowed objective verification independent of self-report.
As participants used their own equipment, a wide range of hardware and software configurations was observed.
Operating system:
| OS | n | % |
|---|---|---|
| Windows | 14 | 63.6 |
| macOS | 4 | 18.2 |
| GNU/Linux | 3 | 13.6 |
| Android | 1 | 4.5 |
Web browser:
Despite the task instructions recommending Firefox, only 3 participants (13.6%) used it. Chrome/Chromium was the most common browser (n=11, 50.0%).
| Browser | n | % |
|---|---|---|
| Chrome/Chromium | 11 | 50.0 |
| Firefox | 3 | 13.6 |
| Microsoft Edge | 3 | 13.6 |
| Safari | 2 | 9.1 |
| Other | 3 | 13.6 |
Audio input and monitoring:
Most participants (n=17, 77.3%) recorded using their device’s built-in microphone; only 2 used an external audio interface. For monitoring, 10 participants used wired headphones or earphones, 7 used Bluetooth or wireless devices (despite guidance against this), and 5 used no headphones.
Across all participants, 71.8% of the assigned tasks were completed successfully (158 of 220 attempts: 22 participants × 10 tasks), indicating that most users were able to achieve the core objectives using the platform.
This suggests that the platform’s primary recording workflow is robust, while secondary features may require clearer affordances or onboarding.
| Exercise | Task | Description | Completed (%) | Completed (n) |
|---|---|---|---|---|
| Ex. 1 | T1 | Record in an existing collaborative composition | 68.2 | 15 / 22 |
| Ex. 1 | T2 | Manually annotate track metadata (language) | 31.8 | 7 / 22 |
| Ex. 1 | T3 | Update composition title with country/language reference | 59.1 | 13 / 22 |
| Ex. 1 | T4 | Select and clone an appropriate recording template | 77.3 | 17 / 22 |
| Ex. 1 | T5 | Estimate browser round-trip latency (screenshot) | 95.5 | 21 / 22 |
| Ex. 1 | T6 | Record a performance on top of existing tracks | 90.9 | 20 / 22 |
| Ex. 2 | T7 | Create a new collection | 72.7 | 16 / 22 |
| Ex. 2 | T8 | Create a composition within a collection | 77.3 | 17 / 22 |
| Ex. 2 | T9 | Record using a metronome and/or guide track | 100.0 | 22 / 22 |
| Ex. 2 | T10 | Invite a collaborator via email | 45.5 | 10 / 22 |
The post-task questionnaire provides complementary insight into participant background, execution conditions, and subjective perceptions of usability and workload. Individual participants’ questionnaire responses can be found in the surveys/ folder.

All participants reported that this was their first interaction with the Hi-Audio platform. Despite most participants having no prior experience measuring audio round-trip latency (see Participants), the majority were nevertheless able to complete the corresponding task (T5) successfully.
The SUS results indicate marginal overall usability, with substantial variability across users.
| Metric | Mean | SD | Median | Min | Max |
|---|---|---|---|---|---|
| SUS Score | 55.8 | 19.3 | 55.0 | 15.0 | 82.5 |
Workload scores indicate a moderate perceived workload, primarily driven by mental demand and effort rather than physical demand.
| Metric | Mean | SD of Dimension Means | Min | Max |
|---|---|---|---|---|
| NASA-TLX (raw) | 37.3 | 13.5 | 1.7 | 61.0 |
On a 0–100 scale, a raw score of 37.3 falls in the low-to-moderate range, suggesting that while the platform requires meaningful cognitive engagement, it does not impose high perceived burden on most users. The SD (13.5) is the sample SD of the 6 dimension means (Performance inverted); see SUPPLEMENTARY.md for details. As shown in the subscale breakdown below, Mental Demand and Frustration had the highest mean scores, whereas Physical Demand was low — consistent with the interaction-based nature of the tasks.
| Dimension | Mean | SD |
|---|---|---|
| Mental Demand | 54.36 | 25.22 |
| Physical Demand | 14.91 | 15.63 |
| Temporal Demand | 30.00 | 17.89 |
| Performance | 41.09 | 32.43 |
| Effort | 38.86 | 26.39 |
| Frustration | 44.50 | 31.24 |
Higher scores indicate greater workload. The Performance dimension is inversely scaled.
Overall, the evaluation indicates that the platform successfully supports its primary objective of collaborative audio recording for dataset creation. High success rates for recording and latency-related tasks demonstrate that core functionality can be used effectively by musicians without extensive prior training.
Although participants were not explicitly instructed to configure privacy levels or user roles, these features were implicitly used throughout the exercises. Post-hoc inspection of the generated data confirmed:
This suggests that access-control mechanisms are intuitively discoverable, even without direct instruction.
The post-task questionnaire included two open-ended questions that allowed participants to freely report difficulties encountered and provide general feedback. Analysis of these responses identified recurring themes:
Despite these limitations, combining objective task performance with subjective usability and workload measures provides a transparent and practical view of how musicians interact with the platform in real conditions.
The Hi-Audio platform is developed as part of the project Hybrid and Interpretable Deep Neural Audio Machines, funded by the European Research Council (ERC) under the European Union’s Horizon Europe research and innovation programme (grant agreement No. 101052978).

We also thank La Scène, the student music association of Télécom Paris, for their support in recruiting participants and facilitating the evaluation, as well as all the participants who generously contributed their time to this study.

If you use or reference the data or findings from this repository, please cite the published journal article. You may also cite the repository directly.
Gil Panal, J. M., David, A., & Richard, G. (2026). The Hi-Audio online platform for recording and distributing multi-track music datasets. Journal on Audio, Speech, and Music Processing. https://doi.org/10.1186/s13636-026-00459-0
BibTeX:
```bibtex
@article{GilPanal2026,
  author  = {Gil Panal, Jos{\'e} M. and David, Aur{\'e}lien and Richard, Ga{\"e}l},
  title   = {The Hi-Audio online platform for recording and distributing multi-track music datasets},
  journal = {Journal on Audio, Speech, and Music Processing},
  year    = {2026},
  issn    = {3091-4523},
  doi     = {10.1186/s13636-026-00459-0},
  url     = {https://doi.org/10.1186/s13636-026-00459-0}
}
```
A preprint version is also available at: https://hal.science/hal-05153739
Repository citation:
Gil Panal, J. M., David, A., & Richard, G. (2026). Hi-Audio user evaluation results [Data repository]. GitHub. https://github.com/idsinge/hiaudio_user_evaluation_results
```bibtex
@misc{GilPanal2026data,
  author = {Gil Panal, Jos{\'e} M. and David, Aur{\'e}lien and Richard, Ga{\"e}l},
  title  = {Hi-Audio User Evaluation Results},
  year   = {2026},
  url    = {https://github.com/idsinge/hiaudio_user_evaluation_results}
}
```
The contents of this repository are released under the
Creative Commons Attribution 4.0 International (CC BY 4.0) license.
You are free to share and adapt the material for any purpose, provided appropriate credit is given.
See the LICENSE file for full terms.