Yakety is a voice transcription tool using a C-style C++ architecture with platform abstraction. The codebase follows C conventions with minimal C++ usage (only for whisper.cpp integration). Core features:
- Cross-platform: macOS (Objective-C/Swift) and Windows (Win32 API)
- Singleton patterns: Audio recorder, preferences, models
- Platform abstraction: Clean separation between core logic and platform code
- Minimal dependencies: Uses system APIs directly
File Extensions:
- .c - Pure C code
- .cpp - C++ code (only when whisper.cpp features needed)
- .m - Objective-C (macOS platform layer)
- .swift - SwiftUI dialogs (macOS only)
Naming Conventions:
// Functions: module_action_object
bool audio_recorder_init(void); // src/audio.c:82
int keylogger_set_combination(combo); // Keylogger API
void preferences_set_string(key, value); // src/preferences.h:24
// Types: CamelCase with descriptive names
typedef struct {
    ma_device device;
    float *buffer;
    bool is_recording; // Atomic access required
} AudioRecorder; // src/audio.c:14-32
// Constants: UPPER_CASE
#define WHISPER_SAMPLE_RATE 16000 // src/audio.c:10
#define MIN_RECORDING_DURATION 0.1 // src/main.c:26
C-Style Casting:
AudioRecorder *recorder = (AudioRecorder *) pDevice->pUserData; // src/audio.c:39
const float *input = (const float *) pInput; // src/audio.c:45
Header Guards:
#ifndef AUDIO_H
#define AUDIO_H
// ... content ...
#endif // AUDIO_H
Audio Recorder (src/audio.h, src/audio.c):
// Global singleton instance
static AudioRecorder *g_recorder = NULL;
bool audio_recorder_init(void) {
    if (g_recorder) {
        return false; // Already initialized
    }
    g_recorder = (AudioRecorder *) calloc(1, sizeof(AudioRecorder));
    if (!g_recorder) {
        return false; // Allocation failed
    }
    // ... initialization ...
    return true;
}
void audio_recorder_cleanup(void) {
    if (!g_recorder) return;
    // ... cleanup ...
    free(g_recorder);
    g_recorder = NULL;
}
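The lifecycle this pattern produces can be sketched in a self-contained form. The `Recorder` struct and the `recorder_*` names below are hypothetical stand-ins for `AudioRecorder` and the `audio_recorder_*` functions, and the real device setup is omitted.

```c
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical stand-in for AudioRecorder; the real struct also holds an ma_device.
typedef struct {
    float *buffer;
    bool is_recording;
} Recorder;

// Single file-scope instance, invisible to other translation units.
static Recorder *g_recorder = NULL;

bool recorder_init(void) {
    if (g_recorder) {
        return false; // Already initialized
    }
    g_recorder = (Recorder *) calloc(1, sizeof(Recorder));
    return g_recorder != NULL; // false if allocation failed
}

void recorder_cleanup(void) {
    if (!g_recorder) return; // Safe to call when not initialized
    free(g_recorder->buffer);
    free(g_recorder);
    g_recorder = NULL; // Allows a later re-init
}
```

In this sketch a second `recorder_init()` call returns `false` until `recorder_cleanup()` has run, and calling `recorder_cleanup()` twice is harmless.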
Directory Structure:
src/
├── audio.h/c # Cross-platform core logic
├── utils.h # Platform abstraction interface
├── mac/ # macOS implementations
│   ├── app.m # NSApplication handling
│   ├── utils.m # Platform-specific utilities
│   └── dialogs/ # SwiftUI dialog implementations
└── windows/ # Windows implementations
    ├── app.c # Win32 application handling
    └── utils.c # Platform-specific utilities
Interface Pattern (src/utils.h):
// Cross-platform interface
void utils_open_accessibility_settings(void);
bool utils_set_launch_at_login(bool enabled);
double utils_get_time(void);
// Platform implementations differ:
// - src/mac/utils.m: Uses NSWorkspace, CFAbsoluteTimeGetCurrent
// - src/windows/utils.c: Uses ShellExecute, GetTickCount64
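To show how core code can depend only on the interface, here is a minimal sketch of a caller that uses `utils_get_time()` to decide whether a recording is long enough. The body of `utils_get_time()` is a portable stand-in built on C11 `timespec_get`, not either platform implementation. The helper `duration_is_sufficient` is hypothetical, and the sketch assumes that both the return value and `MIN_RECORDING_DURATION` are in seconds.

```c
#include <stdbool.h>
#include <time.h>

#define MIN_RECORDING_DURATION 0.1 // Assumed to be in seconds

// Stand-in for a platform implementation of utils_get_time().
// Assumption: the function returns seconds as a double.
double utils_get_time(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC); // C11 calendar time
    return (double) ts.tv_sec + (double) ts.tv_nsec / 1e9;
}

// Hypothetical helper: true if the span is long enough to transcribe.
// It sees only the interface, never the platform code behind it.
bool duration_is_sufficient(double start_time, double end_time) {
    return (end_time - start_time) >= MIN_RECORDING_DURATION;
}
```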
App Initialization Pattern (src/app.h):
typedef void (*AppReadyCallback)(void);
int app_init(const char *name, const char *version, bool is_console, AppReadyCallback on_ready);
// Platform-specific implementations:
// - src/mac/app.m: Uses NSApplication, NSApplicationDelegate
// - src/windows/app.c: Uses CreateWindow, message pump
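The callback contract can be illustrated with a stub. This is only a sketch: the stub invokes `on_ready` immediately and returns 0 on success or -1 on bad arguments, which are assumptions about behavior. The real implementations presumably call it once the platform application has finished launching.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*AppReadyCallback)(void);

static int g_ready_calls = 0; // Counts callback invocations for this demo

// Stub standing in for a platform implementation of app_init().
// Assumption: 0 means success and -1 means failure.
int app_init(const char *name, const char *version, bool is_console, AppReadyCallback on_ready) {
    if (!name || !version) return -1;
    (void) is_console; // Unused in this stub
    if (on_ready) on_ready(); // Real code would defer this until launch completes
    return 0;
}

// Example callback: the place where core modules would be initialized.
static void on_app_ready(void) {
    g_ready_calls++;
}
```

A caller would pass its own callback, for example `app_init("Yakety", "1.0", true, on_app_ready)`, and keep platform details out of the core entry point.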
Atomic Operations (src/audio.c:41, src/utils.h:43-46):
// Thread-safe boolean access
bool utils_atomic_read_bool(bool *ptr);
void utils_atomic_write_bool(bool *ptr, bool value);
// Usage in audio callback (audio thread → main thread)
if (!utils_atomic_read_bool(&recorder->is_recording)) {
    return;
}
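One possible way to implement these two helpers is sketched below using the GCC/Clang `__atomic` builtins. This is an assumption, not the project's actual code: the platform files may use different primitives, and MSVC does not provide these builtins, so a Windows build would need another mechanism.

```c
#include <stdbool.h>

// Sketch for GCC and Clang only; sequentially consistent ordering is
// chosen for simplicity rather than performance.
bool utils_atomic_read_bool(bool *ptr) {
    return __atomic_load_n(ptr, __ATOMIC_SEQ_CST);
}

void utils_atomic_write_bool(bool *ptr, bool value) {
    __atomic_store_n(ptr, value, __ATOMIC_SEQ_CST);
}
```

The main thread would write the flag with `utils_atomic_write_bool` and the audio callback would read it with `utils_atomic_read_bool`, so neither side touches the plain `bool` directly.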
Return Code Pattern:
// Success: 0, Failure: -1 or non-zero
int audio_recorder_start(void); // src/audio.h:17
int keylogger_init(callbacks, userdata); // Returns 0 on success
// Boolean for simple operations
bool audio_recorder_init(void); // src/audio.h:10
bool preferences_init(void); // src/preferences.h:9
Error Logging:
if (ma_device_start(&recorder->device) != MA_SUCCESS) {
    utils_atomic_write_bool(&recorder->is_recording, false);
    return -1;
}
Modal Dialog Implementation (src/mac/dialogs/dialog_utils.swift:18-23):
func runModalDialog<T: View, StateType: ModalDialogState>(
    content: T,
    state: StateType,
    windowSize: NSSize = NSSize(width: 400, height: 200),
    windowTitle: String = ""
) -> StateType.ResultType
Dialog State Protocol:
protocol ModalDialogState: ObservableObject {
    associatedtype ResultType
    var isCompleted: Bool { get set }
    var result: ResultType { get }
    func reset()
}
Adding New Features:
- Core Logic - Implement in src/ using C-style conventions
- Platform Interface - Add declarations to the appropriate header (e.g., src/utils.h)
- Platform Implementation - Implement in src/mac/ and src/windows/
- Integration - Wire up in the src/main.c app lifecycle
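As a rough illustration of the interface and implementation steps, the sketch below declares one function and gives each platform its own definition. The function `utils_platform_name` is hypothetical. Preprocessor checks are used here only to keep the example in a single file; in the layout described in this document the definitions would live in separate files under src/mac/ and src/windows/.

```c
#include <stddef.h>

// Hypothetical interface function; it would be declared once in a
// shared header such as src/utils.h.
const char *utils_platform_name(void);

// One definition per platform. The #if chain stands in for the
// per-directory source files selected by the build system.
#if defined(_WIN32)
const char *utils_platform_name(void) { return "windows"; }
#elif defined(__APPLE__)
const char *utils_platform_name(void) { return "macos"; }
#else
const char *utils_platform_name(void) { return "unknown"; }
#endif
```

Core code then calls `utils_platform_name()` without including any platform header.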
Adding New Dialogs:
- macOS: Create a SwiftUI view in src/mac/dialogs/
- Windows: Implement a Win32 dialog in src/windows/dialog.c
- Interface: Add the C function declaration in src/dialog.h
Audio pipeline follows whisper.cpp requirements:
#define WHISPER_SAMPLE_RATE 16000 // Fixed 16kHz
#define WHISPER_CHANNELS 1 // Mono only
Recording flow (src/main.c:168-215):
Key Press → audio_recorder_start() → data_callback() fills buffer
Key Release → audio_recorder_stop() → get_samples() → transcription_process()
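The flow above can be sketched with stubs. Everything prefixed `stub_` is hypothetical, since the real signatures of the stop, get-samples, and transcription functions are not shown here. The sketch also assumes that `MIN_RECORDING_DURATION` is in seconds and is checked before transcription.

```c
#include <stdlib.h>

#define WHISPER_SAMPLE_RATE 16000
#define MIN_RECORDING_DURATION 0.1 // Assumed to be in seconds

static int g_transcriptions = 0; // Counts transcription calls for this demo

// Hypothetical stubs for the audio and transcription modules.
static int stub_recorder_start(void) { return 0; }
static int stub_recorder_stop(void) { return 0; }
static float *stub_get_samples(size_t count) {
    return (float *) calloc(count, sizeof(float)); // Caller-owned, silent audio
}
static void stub_transcription_process(const float *samples, size_t count) {
    (void) samples;
    (void) count;
    g_transcriptions++;
}

// Key press: begin capturing.
int on_key_press(void) {
    return stub_recorder_start();
}

// Key release: stop, fetch samples, and transcribe only if long enough.
// Returns 1 if transcribed, 0 if the clip was too short, -1 on error.
int on_key_release(size_t captured_samples) {
    if (stub_recorder_stop() != 0) return -1;
    float *samples = stub_get_samples(captured_samples);
    if (!samples) return -1;
    double duration = (double) captured_samples / WHISPER_SAMPLE_RATE;
    int result = 0;
    if (duration >= MIN_RECORDING_DURATION) {
        stub_transcription_process(samples, captured_samples);
        result = 1;
    }
    free(samples); // The caller owns the returned buffer
    return result;
}
```

At 16 kHz mono, one second of audio is 16000 samples, so a clip of 800 samples (0.05 s) is skipped in this sketch.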
Core Modules:
- src/main.c - App entry point and lifecycle (329-391)
- src/audio.c/h - Audio recording singleton
- src/preferences.c/h - Configuration management
- src/models.c/h - Whisper model loading
- src/transcription.cpp/h - Whisper.cpp integration (C++)
Platform Abstraction:
- src/utils.h - Cross-platform interface definitions
- src/app.h - Application framework interface
- src/mac/ - macOS implementations (Objective-C/Swift)
- src/windows/ - Windows implementations (Win32 C)
Build System:
- CMakeLists.txt - Main build configuration
- cmake/PlatformSetup.cmake - Platform-specific setup
- cmake/BuildWhisper.cmake - Whisper.cpp integration
Key Constants:
#define WHISPER_SAMPLE_RATE 16000 // Audio format for transcription
#define WHISPER_CHANNELS 1 // Mono audio
#define MIN_RECORDING_DURATION 0.1 // Minimum recording length
#define PERMISSION_RETRY_DELAY_MS 500 // macOS permission retry delay
Development:
cmake --preset debug # Debug build with Ninja
cmake --preset release # Release build with Ninja
Windows Debugging:
cmake --preset vs-debug # Visual Studio generator for debugging
macOS Accessibility Permissions:
- Handle in src/main.c:78-117 with dialog prompts
- Retry mechanism for permission granting
Thread Safety:
- Audio callback runs on separate thread
- Use utils_atomic_* for shared state access
- Main app state in src/main.c:28-33
Model Loading:
- Single unified function: models_load() in src/models.c
- Handles download dialogs and fallback logic
- Path management through preferences system
Memory Management:
- Consistent use of malloc/free for C compatibility
- Audio buffer auto-resizing in src/audio.c:54-66
- Caller owns returned buffers (e.g., audio_recorder_get_samples)
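The auto-resizing and ownership rules can be sketched as follows. The `SampleBuffer` type and both functions are hypothetical. The doubling strategy and the initial capacity of 1024 samples are assumptions, not details taken from src/audio.c.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical growable sample buffer; the real layout may differ.
typedef struct {
    float *data;
    size_t count;    // Samples stored
    size_t capacity; // Samples allocated
} SampleBuffer;

// Appends n samples, doubling capacity as needed. Returns false on
// allocation failure and leaves the buffer unchanged in that case.
bool sample_buffer_append(SampleBuffer *buf, const float *input, size_t n) {
    if (buf->count + n > buf->capacity) {
        size_t new_capacity = buf->capacity ? buf->capacity : 1024;
        while (new_capacity < buf->count + n) new_capacity *= 2;
        float *grown = (float *) realloc(buf->data, new_capacity * sizeof(float));
        if (!grown) return false;
        buf->data = grown;
        buf->capacity = new_capacity;
    }
    memcpy(buf->data + buf->count, input, n * sizeof(float));
    buf->count += n;
    return true;
}

// Returns a malloc'd copy that the caller must free.
// Returns NULL if the buffer is empty or memory runs out.
float *sample_buffer_copy(const SampleBuffer *buf, size_t *out_count) {
    *out_count = 0;
    if (buf->count == 0) return NULL;
    float *copy = (float *) malloc(buf->count * sizeof(float));
    if (!copy) return NULL;
    memcpy(copy, buf->data, buf->count * sizeof(float));
    *out_count = buf->count;
    return copy;
}
```

Returning a copy keeps the recorder's internal buffer private, so the caller can `free()` its samples at any time without affecting a later recording.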