@kartva
Last active September 4, 2024 19:05
Google Summer of Code 2024 - Unicode Inc. - Implementing DurationFormat - Final Submission

Duration Formatter 4X

Author: Kartavya Vashishtha
Mentor: Younies Mahmoud
Tech Lead: Shane Carr

Introduction

ICU4X is a project designed to handle common software internationalization needs, such as formatting dates, times, and numbers across different locales. It aims to provide these capabilities in a manner that is lightweight, modular, and compatible with various operating systems and environments, specifically targeting client-side applications. The project emphasizes performance, data size, and ease of use, striving to support modern internationalization requirements.

In this report, we focus on the design and implementation of a Duration Formatter for ICU4X, a crucial component that enables users to format durations in a way that is not only localized but also heavily customizable.

Background

Duration representation is essential across various applications—from tracking time in project management tools to media playback and displaying timers on devices. These applications require durations to be easily readable and culturally appropriate across different locales.

ICU4X needs a dedicated duration formatter that takes durations expressed in standardized units such as milliseconds, seconds, or minutes and converts them into a human-readable format, breaking the duration down into components (hours, minutes, seconds, and so on) and formatting them according to locale-specific norms.
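
As a rough illustration of the "breaking down into components" step, the sketch below splits a millisecond count into hours, minutes, seconds, and leftover milliseconds using integer division. The `Components` type and `break_down` function are hypothetical names for this example only and are not part of ICU4X.

```rust
/// Hypothetical helper type for this sketch; not the ICU4X `Duration` struct.
#[derive(Debug, PartialEq, Eq)]
pub struct Components {
    pub hours: u64,
    pub minutes: u64,
    pub seconds: u64,
    pub milliseconds: u64,
}

/// Splits a total number of milliseconds into hours, minutes, seconds,
/// and leftover milliseconds using integer division and remainder.
pub fn break_down(total_ms: u64) -> Components {
    let milliseconds = total_ms % 1_000;
    let total_seconds = total_ms / 1_000;
    let seconds = total_seconds % 60;
    let total_minutes = total_seconds / 60;
    let minutes = total_minutes % 60;
    let hours = total_minutes / 60;
    Components {
        hours: hours,
        minutes: minutes,
        seconds: seconds,
        milliseconds: milliseconds,
    }
}
```

For example, `break_down(7_512_750)` yields 2 hours, 5 minutes, 12 seconds, and 750 milliseconds. Hours are left unbounded here; carrying into days or larger units would follow the same pattern.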

Features of the Duration Formatter 4X

Note

The work described in this report implements the ECMAScript Intl.DurationFormat proposal.

The key features proposed for the duration formatter in ICU4X are:

1. Locale-Aware Formatting

The formatter should recognize and apply locale-specific rules for representing durations, such as locale-specific unit names and the way multiple units are combined into one string.

2. Multiple Widths for Formatting

  • Full Width: Spells out the duration in detail, using full words (e.g., 2 hours, 5 minutes, and 12 seconds).
  • Short Width: A more concise representation, using abbreviated unit names (e.g., 2hrs, 5min, 12sec).
  • Narrow Width: The shortest worded format, typically a number followed directly by a single-letter abbreviation (e.g., 2h 5m 12s).
  • Digital Width: A clock-style representation that uses separators instead of unit names (e.g., 2:05:12).
  • Combined: Larger units in a worded style followed by a digital time portion (e.g., 5 weeks, 3 days, 2:05:12.00750).
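
To make the difference between the widths concrete, here is a minimal sketch that reproduces the example strings from the list above for an hours/minutes/seconds triple. The `Width` enum and `format_hms` function are hypothetical, and the English strings are hard-coded; the real formatter reads its patterns from locale data instead.

```rust
/// Hypothetical width selector for this sketch (not the ICU4X options type).
#[derive(Clone, Copy)]
pub enum Width {
    Full,
    Short,
    Narrow,
    Digital,
}

/// Naive English pluralization, enough for this illustration only.
fn full_unit(n: u32, singular: &str) -> String {
    if n == 1 {
        format!("{} {}", n, singular)
    } else {
        format!("{} {}s", n, singular)
    }
}

/// Formats hours, minutes, and seconds with hard-coded English strings.
pub fn format_hms(h: u32, m: u32, s: u32, width: Width) -> String {
    match width {
        Width::Full => format!(
            "{}, {}, and {}",
            full_unit(h, "hour"),
            full_unit(m, "minute"),
            full_unit(s, "second")
        ),
        Width::Short => format!("{}hrs, {}min, {}sec", h, m, s),
        Width::Narrow => format!("{}h {}m {}s", h, m, s),
        // Minutes and seconds are zero-padded to two digits.
        Width::Digital => format!("{}:{:02}:{:02}", h, m, s),
    }
}
```

Calling `format_hms(2, 5, 12, Width::Digital)` produces `2:05:12`, and the other variants produce the worded examples listed above.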

3. Support for Different Time Units

The formatter should be capable of expressing durations that range from nanoseconds to years.

4. Fractional Units

The formatter should be able to fold smaller units into a larger unit as a decimal fraction (e.g., 5 seconds and 320 milliseconds expressed as 5.32s).
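
One way to compute such a fractional value without floating-point error is to work in integer nanoseconds and build the decimal string by hand. The sketch below is a simplified illustration with a hypothetical `fractional_seconds` function; it uses plain integers and does not reflect how ICU4X represents decimals internally.

```rust
/// Renders seconds plus sub-second fields as one decimal string,
/// using only integer arithmetic so no precision is lost.
/// Assumes the inputs are small enough not to overflow `u64`.
pub fn fractional_seconds(seconds: u64, millis: u64, micros: u64, nanos: u64) -> String {
    // Total sub-second amount in nanoseconds.
    let sub_ns = millis * 1_000_000 + micros * 1_000 + nanos;
    // Carry any overflow (e.g. 1500 ms) into the whole seconds.
    let whole = seconds + sub_ns / 1_000_000_000;
    let frac = sub_ns % 1_000_000_000;
    if frac == 0 {
        return whole.to_string();
    }
    // Zero-pad to nine digits, then trim trailing zeros.
    let digits = format!("{:09}", frac);
    format!("{}.{}", whole, digits.trim_end_matches('0'))
}
```

With this helper, 5 seconds and 320 milliseconds become `5.32`, and 5 seconds, 130 milliseconds, and 140 microseconds become `5.13014`.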

Implementation Highlights

1. Digital Duration Formatter

I implemented a new data struct for locale-specific CLDR digital duration data and integrated it with ICU4X's data loading architecture, which is designed to optimize for data size, deserialization speed, and other criteria.

2. Duration Formatter

I implemented the full Duration Formatter algorithm in a no_std environment without any allocations. Duration Formatter internally uses List Formatter and Unit Formatter, which are existing ICU4X components.
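
The allocation-free idea can be sketched as follows: each part is written directly into a caller-supplied sink, so no intermediate `String` is ever built. Everything here is hypothetical illustration (`StackBuf`, `write_parts`, and the hard-coded ", " separator); the actual implementation delegates to List Formatter and Unit Formatter rather than doing this by hand.

```rust
// `std::fmt` re-exports `core::fmt`; in a `no_std` crate these items
// would be imported from `core::fmt` instead.
use std::fmt::{self, Write};

/// Fixed-size stack buffer acting as the output sink (hypothetical helper).
pub struct StackBuf {
    buf: [u8; 64],
    len: usize,
}

impl StackBuf {
    pub fn new() -> Self {
        StackBuf { buf: [0; 64], len: 0 }
    }

    /// Returns the text written so far.
    pub fn as_str(&self) -> &str {
        // Only whole `&str` values are ever copied in, so this is valid UTF-8.
        std::str::from_utf8(&self.buf[..self.len]).unwrap_or("")
    }
}

impl Write for StackBuf {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len + s.len();
        if end > self.buf.len() {
            // Out of space: report an error instead of allocating.
            return Err(fmt::Error);
        }
        self.buf[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Writes "value unit" pairs into any sink, joined by a hard-coded ", ".
pub fn write_parts<W: Write>(parts: &[(u64, &str)], sink: &mut W) -> fmt::Result {
    for (i, (value, unit)) in parts.iter().enumerate() {
        if i > 0 {
            sink.write_str(", ")?;
        }
        write!(sink, "{} {}", value, unit)?;
    }
    Ok(())
}
```

Because `write_parts` is generic over the sink, the same code can target a stack buffer on an embedded device or a `String` on a platform with an allocator.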


Pull Requests:

Results

The resulting API looks like this:

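// A negative duration of 5 s, 130 ms, and 140 µs; unset fields keep their defaults.
let duration = Duration {
    sign: DurationSign::Negative,
    seconds: 5,
    milliseconds: 130,
    microseconds: 140,
    ..Default::default()
};

// Long base style, always show the year field, and render milliseconds numerically.
let options = DurationFormatterOptions {
    base: BaseStyle::Long,
    year_visibility: Some(FieldDisplay::Always),
    millisecond: Some(MilliSecondStyle::Numeric),
    ..Default::default()
};

// Options must be validated before the formatter can be constructed.
let options = ValidatedDurationFormatterOptions::validate(options).unwrap();
let formatter = DurationFormatter::try_new(&locale!("en").into(), options).unwrap();
let formatted = formatter.format(&duration);
// The sub-second fields fold into the seconds value: 5 s + 130 ms + 140 µs = 5.13014 s.
assert_eq!(formatted.write_to_string().into_owned(), "-0 years, 5.13014 seconds");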

(I want to replace the ValidatedDurationFormatterOptions::validate(options) call with an options.try_into() call, but had not done so at the time of writing this report.)
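
A possible shape for that change is a `TryFrom` implementation on the validated type, which makes `try_into()` available on the unvalidated options for free. The sketch below uses cut-down, hypothetical types (`Options`, `ValidatedOptions`, `FieldStyle`, `InvalidOptions`), and the validation rule is only an illustrative assumption, not the actual ICU4X check.

```rust
// Explicit imports keep the sketch working on editions where these
// traits are not in the prelude.
use std::convert::{TryFrom, TryInto};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FieldStyle {
    Long,
    Short,
    Narrow,
    Numeric,
}

/// Hypothetical, cut-down options type for this sketch.
#[derive(Clone, Copy, Debug)]
pub struct Options {
    pub hour: FieldStyle,
    pub minute: FieldStyle,
}

/// Options that passed validation; only constructible through `TryFrom`.
#[derive(Debug)]
pub struct ValidatedOptions(Options);

impl ValidatedOptions {
    pub fn options(&self) -> Options {
        self.0
    }
}

#[derive(Debug, PartialEq, Eq)]
pub struct InvalidOptions;

impl TryFrom<Options> for ValidatedOptions {
    type Error = InvalidOptions;

    fn try_from(options: Options) -> Result<Self, Self::Error> {
        // Illustrative rule (an assumption for this sketch): a numeric hour
        // may not be followed by a spelled-out minute.
        if options.hour == FieldStyle::Numeric && options.minute != FieldStyle::Numeric {
            return Err(InvalidOptions);
        }
        Ok(ValidatedOptions(options))
    }
}

/// Call site as it might look after the change.
pub fn validate(options: Options) -> Result<ValidatedOptions, InvalidOptions> {
    options.try_into()
}
```

The benefit of this design is that callers use a standard conversion trait instead of a bespoke constructor, while invalid combinations still cannot reach the formatter.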

Cross-Project Improvements

  • Using fixed_decimal (another ICU4X utility crate) motivated improvements to documentation and functionality.
  • Applying ListFormatter in Duration Formatter motivated the creation of the writeable::WithPart struct in the writeable crate, which is a utility crate in ICU4X.

Future Work

While the core algorithm is implemented, stabilizing Duration Formatter still requires:

  • More tests, including from the Unicode Conformance Suite.
  • Benchmarks.
  • FFI bindings.