ebup-viewer
Rust desktop reader for EPUB/TXT/Markdown with synchronized TTS playback, sentence highlighting, bookmark persistence, and a starter library flow (recent books + Calibre).
Current Project Status
This project is actively developed and currently supports:
- Reading flow with sentence-aware highlighting and click-to-play from sentence.
- TTS synthesis through Piper (piper-rs) with multi-process workers.
- Audio playback through rodio, with playback speed applied as post-processing (sonic-rs-sys).
- Text normalization and chunking pipeline for TTS quality and stability.
- Starter mode for opening local files, recent books, and Calibre-backed books.
- Per-book persistent config/bookmark/cache with content-hash-based cache directories.
- Ctrl+C safe quit handling with config/bookmark save before exit.
Supported Source Formats
- .epub
- .txt
- .md / .markdown
Loading behavior:
- .txt is read directly.
- .md and .epub attempt a pandoc plain-text conversion path first.
- If pandoc conversion fails:
  - .md falls back to raw markdown text.
  - .epub falls back to native EPUB parsing (epub + html2text).
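The fallback order above can be sketched as follows. This is only an illustration: SourceKind, classify, and load_text are hypothetical names, the two converter closures stand in for the real pandoc and native EPUB paths, case-insensitive extension matching is an assumption, and passing the source as &str is a simplification (an EPUB is a binary archive).

```rust
use std::path::Path;

/// Source kinds listed in this README (names are hypothetical).
#[derive(Debug, PartialEq, Clone, Copy)]
enum SourceKind {
    Txt,
    Markdown,
    Epub,
}

/// Classify a path by extension. Case-insensitive matching is an assumption.
fn classify(path: &Path) -> Option<SourceKind> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "txt" => Some(SourceKind::Txt),
        "md" | "markdown" => Some(SourceKind::Markdown),
        "epub" => Some(SourceKind::Epub),
        _ => None,
    }
}

/// Apply the documented fallback order.
/// `pandoc` and `native_epub` are stand-ins for the real converters;
/// each returns None when conversion fails.
fn load_text(
    kind: SourceKind,
    raw: &str,
    pandoc: impl Fn(&str) -> Option<String>,
    native_epub: impl Fn(&str) -> Option<String>,
) -> Option<String> {
    match kind {
        // .txt is read directly, pandoc is never consulted.
        SourceKind::Txt => Some(raw.to_string()),
        // .md tries pandoc, then falls back to the raw markdown text.
        SourceKind::Markdown => pandoc(raw).or_else(|| Some(raw.to_string())),
        // .epub tries pandoc, then falls back to native EPUB parsing.
        SourceKind::Epub => pandoc(raw).or_else(|| native_epub(raw)),
    }
}
```

Passing the converters as closures keeps the ordering logic testable without pandoc installed.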
Image behavior:
- EPUB images are extracted and rendered in reading view.
- Markdown image links are resolved and rendered when local files exist.
High-Level Features
- Starter mode with:
  - Local path open input.
  - Recent books panel (with cached cover thumbnails).
  - Calibre browser panel (sortable/searchable).
- Reader mode with:
  - Page navigation.
  - Theme toggle (day/night).
  - Text-only and pretty-text modes.
  - Search panel (regex-based).
  - TTS controls with sentence-level navigation.
  - Settings panel and stats panel (mutually exclusive).
- TTS behavior:
  - Play page from start.
  - Play from highlighted sentence.
  - Click any sentence to play from there.
  - Sentence seek forward/backward.
  - Auto-scroll and optional center-tracking.
  - Jump to currently spoken sentence.
- Persistence:
  - Per-book bookmark (page, sentence, scroll offset).
  - Per-book UI/TTS config overrides.
  - TTS WAV cache.
  - Normalization cache.
Architecture Overview
Top-level modules:
- src/main.rs: process startup, config load, path-mode vs starter-mode app launch, Ctrl+C signal flagging.
- src/app/: GUI state/update/view, subscriptions, reducers/effects.
- src/epub_loader.rs: source loading and image extraction.
- src/pagination.rs: pagination from sentence stream into page text.
- src/text_utils.rs: sentence splitting with abbreviation handling and oversized-comma-chain splitting.
- src/normalizer.rs: TTS normalization, sentence/page caching, display/audio index mapping, long-sentence chunking.
- src/tts.rs: TTS engine facade, worker pool orchestration, cache lookups, playback append/time-stretch.
- src/tts_worker.rs: --tts-worker subprocess protocol and synthesis execution.
- src/cache.rs: bookmark/config/cache paths, recent books, thumbnails.
- src/config/: typed config models, grouped TOML schema, defaults, parse/serialize.
- src/calibre.rs: Calibre catalog loading, caching, thumbnail hydration, export/materialization.
App update split (src/app/update/):
- core/mod.rs: subscription wiring (Tick, runtime events, signal polling).
- core/reducer.rs: message reducer and effect dispatch.
- core/runtime.rs: effect execution (save/load, quit, async tasks).
- core/shortcuts.rs: keybinding parsing/matching.
- appearance.rs: config mutations (theme, fonts, spacing, numeric edit input, window geometry).
- navigation.rs: page transitions and page-level state migration.
- scroll.rs: scroll tracking, bookmark persistence throttling, geometry-aware sentence targeting.
- tts.rs: user TTS actions and lifecycle glue.
- tts/transitions.rs: explicit TTS state transitions and mapping setup.
- tts/effects.rs: action-to-task/effect conversion.
Runtime Flow
1) Startup
- If the process receives --tts-worker, it runs worker mode and exits after the protocol loop.
- Otherwise the main app installs a Ctrl+C handler, initializes tracing, loads conf/config.toml, and parses the optional source path arg.
2) Starter Mode (no path arg)
- Opens starter UI (run_app_starter).
- Recent books list is loaded from cache metadata.
- Calibre list can load immediately if enabled in conf/calibre.toml.
3) Direct Book Mode (path arg)
- Source path is remembered in cache metadata.
- Per-book cached config override is loaded if present.
- Some fields are intentionally forced from base config to avoid stale per-book values:
  - log_level
  - tts_threads
  - tts_progress_log_interval_secs
  - all keybindings
- Bookmark is loaded if present.
- Source text and images are loaded.
- Reader app starts and restores page/sentence/scroll when possible.
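A minimal sketch of how a per-book override could be combined with the base config while forcing the fields listed above. The Config struct here is a hypothetical four-field subset (the real models live in src/config/), effective_config is an invented name, and keybindings are left out for brevity even though they are forced the same way.

```rust
/// Hypothetical subset of the config; field types are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct Config {
    font_size: u32,
    log_level: String,
    tts_threads: usize,
    tts_progress_log_interval_secs: f32,
}

/// Start from the per-book override when one exists, then overwrite the
/// fields that must always come from the base config.
fn effective_config(base: &Config, per_book: Option<&Config>) -> Config {
    let mut cfg = per_book.cloned().unwrap_or_else(|| base.clone());
    // Forced fields: a stale per-book value must not win here.
    cfg.log_level = base.log_level.clone();
    cfg.tts_threads = base.tts_threads;
    cfg.tts_progress_log_interval_secs = base.tts_progress_log_interval_secs;
    cfg
}
```

With this shape, appearance-style settings such as font_size survive per book, while process-level settings follow the base file.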
4) Reading and TTS
- Page text is represented as sentence lists.
- TTS start request goes through transition logic:
  - normalize + map display sentences to audio sentences.
  - split initial batch vs append batch.
  - synthesize/cache missing audio in worker pool.
  - start playback with optional pause insertion.
- Highlight index is updated from playback timing ticks and mapping.
- Auto-scroll targets use geometry-aware estimates and guard bands to keep highlighted text visible.
UI and Layout Behavior
Top Controls
- Buttons include: Previous, Next, theme toggle, Close Book, settings toggle, stats toggle, plus optional controls (Text Only/Pretty Text, TTS toggle, search toggle).
- Top bar uses width planning (src/app/topbar_layout.rs) to hide lower-priority controls when width is tight.
- Control rows and TTS controls are fixed-height to avoid vertical text/button collapse.
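One way width planning of this kind could work is a priority-ordered fit. The sketch below is an assumption about the approach, not the contents of src/app/topbar_layout.rs: Control, plan_topbar, and the widths and priorities are all hypothetical.

```rust
/// A top-bar control with a hypothetical width and priority
/// (lower number = more important).
#[derive(Debug, Clone, PartialEq)]
struct Control {
    label: &'static str,
    width: f32,
    priority: u8,
}

/// Admit controls in priority order and stop at the first one that does not
/// fit; return the kept labels in their original left-to-right order.
fn plan_topbar(controls: &[Control], available: f32) -> Vec<&'static str> {
    let mut order: Vec<usize> = (0..controls.len()).collect();
    // Stable sort: equal priorities keep their left-to-right order.
    order.sort_by_key(|&i| controls[i].priority);

    let mut used = 0.0;
    let mut keep = vec![false; controls.len()];
    for i in order {
        if used + controls[i].width <= available {
            used += controls[i].width;
            keep[i] = true;
        } else {
            // Everything of equal or lower priority after this is hidden too.
            break;
        }
    }
    controls
        .iter()
        .zip(keep)
        .filter(|(_, k)| *k)
        .map(|(c, _)| c.label)
        .collect()
}
```

Stopping at the first control that does not fit keeps the result predictable: a low-priority control never appears while a higher-priority one is hidden.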
Text Modes
- Pretty Text: page sentence view with clickable spans and sentence highlight.
- Text Only: normalized TTS preview with clickable spans mapped back to display sentence indices.
Settings Panel
- Font family/weight, line spacing, pause-after-sentence, lines-per-page, margins, word/letter spacing.
- Auto-scroll toggle and center-tracking toggle.
- Day/night highlight RGBA controls.
- Numeric setting labels can be clicked to edit directly in a text box.
- Numeric text input validates range/type and shows red border when invalid.
- While numeric input is active, mouse wheel adjusts value by setting-specific step.
Stats Panel
- Mutually exclusive with settings panel.
- Includes:
  - page index
  - TTS progress (3 decimals)
  - page/book ETA
  - words/sentences on page
  - percent at page start/end
  - words/sentences read through current page
Search
- Regex-based sentence search within current page context.
- In text-only mode it searches normalized audio sentences.
- In pretty mode it searches display sentences.
TTS, Normalization, and Quality Pipeline
Playback Speed vs Synthesis
- Synthesis is generated by Piper workers.
- Playback speed (tts_speed) is applied later at playback append (time_stretch), not in synthesis generation.
Normalization (conf/normalizer.toml)
- Cleans markdown/link/citation noise.
- Expands abbreviations/acronyms and supports custom pronunciation maps.
- Supports sentence-level or page-level normalization cache modes.
- Performs long-sentence chunking for TTS (chunk_long_sentences, char/word limits).
Mapping Model
Normalization outputs:
- audio_sentences
- display_to_audio
- audio_to_display
These mappings are used to keep click-to-play, highlight, and auto-scroll aligned when one display sentence maps to multiple audio chunks.
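To make the one-to-many relationship concrete, here is a sketch of how the three outputs could be built from per-sentence chunk lists. The field names follow the outputs listed above, but the concrete types, the Mapping struct, and build_mapping are assumptions for illustration. An empty chunk list models a display sentence that normalization dropped entirely.

```rust
/// Hypothetical container for the three normalization outputs.
struct Mapping {
    audio_sentences: Vec<String>,
    /// For each display sentence, the audio indices it produced (may be empty).
    display_to_audio: Vec<Vec<usize>>,
    /// For each audio sentence, the display sentence it came from.
    audio_to_display: Vec<usize>,
}

/// Flatten per-display-sentence chunks into one audio list and record
/// the index mapping in both directions.
fn build_mapping(chunks_per_display: &[Vec<String>]) -> Mapping {
    let mut m = Mapping {
        audio_sentences: Vec::new(),
        display_to_audio: Vec::new(),
        audio_to_display: Vec::new(),
    };
    for (display_idx, chunks) in chunks_per_display.iter().enumerate() {
        let mut audio_ids = Vec::with_capacity(chunks.len());
        for chunk in chunks {
            // The next audio index is the current length of the flat list.
            audio_ids.push(m.audio_sentences.len());
            m.audio_sentences.push(chunk.clone());
            m.audio_to_display.push(display_idx);
        }
        m.display_to_audio.push(audio_ids);
    }
    m
}
```

Under this model, click-to-play on display sentence i would start at the first entry of display_to_audio[i], and the highlight for playing audio index j would be audio_to_display[j].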
Oversized Sentence Handling
- TTS chunking limits are configurable (max_audio_chars_per_chunk, max_audio_words_per_chunk).
- Display sentence splitting also protects UI alignment for long comma/semicolon chains.
- This prevents giant single-span highlights and improves click/jump accuracy.
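As an illustration of chunking under both limits, the sketch below greedily packs whitespace-separated words into chunks. It is a simplified stand-in: chunk_sentence is a hypothetical name, its two parameters mirror max_audio_chars_per_chunk and max_audio_words_per_chunk, and the real normalizer may choose split points differently (for example at punctuation).

```rust
/// Greedily pack words so each chunk stays within both limits.
/// A single word longer than `max_chars` is kept whole as its own chunk.
fn chunk_sentence(text: &str, max_chars: usize, max_words: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut words = 0;
    for word in text.split_whitespace() {
        // One extra character for the joining space, if the chunk is non-empty.
        let extra = if current.is_empty() { 0 } else { 1 };
        let would_be = current.chars().count() + extra + word.chars().count();
        if !current.is_empty() && (would_be > max_chars || words + 1 > max_words) {
            // Close the current chunk and start a new one with this word.
            chunks.push(std::mem::take(&mut current));
            words = 0;
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
        words += 1;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

For example, chunk_sentence("one two three four five", 100, 2) splits on the word limit into "one two", "three four", and "five".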
Configuration Reference
Primary config file: conf/config.toml
[appearance]
- theme: day or night
- font_family: enum from FontFamily
- font_weight: light / normal / bold
- font_size: 12..36 (clamped)
- line_spacing: 0.8..2.5 (clamped)
- word_spacing: 0..5
- letter_spacing: 0..3
- lines_per_page: 8..1000 (clamped)
- margin_horizontal: 0..1000
- margin_vertical: 0..100
- day_highlight: RGBA object
- night_highlight: RGBA object
Current defaults in code (src/config/defaults.rs):
- font_size = 22
- lines_per_page = 700
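A small sketch of what clamping to the documented [appearance] ranges could look like. The Appearance struct is a hypothetical three-field subset and the field types are assumptions; only the numeric ranges come from this README.

```rust
/// Hypothetical subset of [appearance]; field types are assumptions.
#[derive(Debug, PartialEq)]
struct Appearance {
    font_size: u32,
    line_spacing: f32,
    lines_per_page: usize,
}

/// Pull out-of-range values back into the documented ranges.
fn clamp_appearance(a: Appearance) -> Appearance {
    Appearance {
        font_size: a.font_size.clamp(12, 36),
        line_spacing: a.line_spacing.clamp(0.8, 2.5),
        lines_per_page: a.lines_per_page.clamp(8, 1000),
    }
}
```

Clamping on load means a hand-edited file with, say, font_size = 99 still yields a usable value instead of an error.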
[window]
- width, height
- optional x, y
Window values are clamped and persisted.
[reading_behavior]
- pause_after_sentence: 0.0..2.0, slider step 0.01
- auto_scroll_tts: bool
- center_spoken_sentence: bool
[ui]
- show_tts: bool
- show_settings: bool
[logging]
- log_level: trace | debug | info | warn | error
[tts]
- tts_model_path: Piper model path (.onnx)
- tts_espeak_path: root path for eSpeak data
- tts_speed: playback speed (0.1..3.0)
- tts_volume: 0.0..2.0
- tts_threads: worker process count (min 1)
- tts_progress_log_interval_secs: 0.1..60.0
[keybindings]
Defaults:
- toggle_play_pause = "space"
- safe_quit = "q"
- next_sentence = "f"
- prev_sentence = "s"
- repeat_sentence = "r"
- toggle_search = "ctrl+f"
- toggle_settings = "ctrl+t"
- toggle_stats = "ctrl+g"
- toggle_tts = "ctrl+y"
Notes:
- Shortcuts are normalized to lowercase.
- spacebar alias is accepted for space.
- Extra unexpected modifiers cause a mismatch.
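The three notes above can be illustrated with a small parser and matcher. This is a sketch under assumptions, not the code in core/shortcuts.rs: Shortcut, parse_shortcut, and shortcut_matches are hypothetical names, and the set of recognized modifiers (ctrl, shift, alt) is assumed.

```rust
/// A parsed shortcut: modifier flags plus a lowercase key name.
#[derive(Debug, PartialEq, Eq, Clone)]
struct Shortcut {
    ctrl: bool,
    shift: bool,
    alt: bool,
    key: String,
}

/// Parse strings such as "ctrl+f" or "Spacebar".
/// Returns None when a segment is empty or no key is given.
fn parse_shortcut(s: &str) -> Option<Shortcut> {
    // Normalize to lowercase before splitting.
    let lower = s.trim().to_lowercase();
    let mut sc = Shortcut { ctrl: false, shift: false, alt: false, key: String::new() };
    for part in lower.split('+').map(str::trim) {
        match part {
            "ctrl" => sc.ctrl = true,
            "shift" => sc.shift = true,
            "alt" => sc.alt = true,
            // Alias: "spacebar" is treated as "space".
            "spacebar" => sc.key = "space".to_string(),
            "" => return None,
            other => sc.key = other.to_string(),
        }
    }
    if sc.key.is_empty() { None } else { Some(sc) }
}

/// Exact match: same key and exactly the same modifiers, so an extra
/// modifier on the pressed combination causes a mismatch.
fn shortcut_matches(binding: &Shortcut, pressed: &Shortcut) -> bool {
    binding == pressed
}
```

Under this sketch, a binding of "ctrl+f" matches "CTRL+F" but not "ctrl+shift+f".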
Normalizer Config Reference
File: conf/normalizer.toml
Important keys:
- enabled
- mode = "sentence" | "page"
- whitespace cleanup toggles
- markdown/link cleanup toggles
- citation/bracket cleanup toggles
- chunk_long_sentences
- max_audio_chars_per_chunk
- max_audio_words_per_chunk
- min_sentence_chars
- require_alphanumeric
- replacement maps and token drops
- acronym expansion and letter sounds
- pronunciation controls:
- year mode
- brand map
- custom pronunciations
Calibre Integration
File: conf/calibre.toml
Capabilities:
- load catalog from Calibre targets
- configurable columns and extension filter
- cached catalog with TTL
- thumbnail prefetch/cache
- local materialization/export for selected books
If disabled, starter UI still works for direct path and recent books.
Cache Layout and Persistence
Root cache: .cache/
Per source (content-hash dir): .cache/<source-content-sha256>/
- bookmark.toml: page/sentence/scroll
- config.toml: per-book settings
- source-path.txt: canonical source path hint (for recent books)
- tts/tts-<hash>.wav: synthesized audio cache
- normalized/: normalization caches
  - s-<sentence-hash>-<config-hash>.toml (sentence mode)
  - p<page>-<source-hash>-<config-hash>.toml (page mode)
- thumbs/cover-thumb.jpg: recent-book cover thumbnail
Cache key notes:
- TTS WAV key includes model path + normalized sentence text.
- Normalization cache keys include normalization config hash.
- Old cache entries are not auto-pruned.
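A sketch of deriving a TTS WAV cache path from the two documented key inputs (model path and normalized sentence text). The function name tts_wav_path is hypothetical, and DefaultHasher is only a stand-in: this README does not say which hash the TTS key uses, and DefaultHasher output is not guaranteed stable across Rust releases, so it would be a poor choice for a real on-disk cache.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

/// Build `<book_cache_dir>/tts/tts-<hash>.wav` from model path + normalized text.
/// Playback speed is deliberately not part of the key in this sketch.
fn tts_wav_path(book_cache_dir: &Path, model_path: &str, normalized: &str) -> PathBuf {
    let mut h = DefaultHasher::new();
    model_path.hash(&mut h);
    normalized.hash(&mut h);
    book_cache_dir
        .join("tts")
        .join(format!("tts-{:016x}.wav", h.finish()))
}
```

Leaving tts_speed out of the key is consistent with speed being applied at playback rather than at synthesis: changing speed would not need to invalidate cached audio.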
Build and Run
Build
cargo build --release
Run in starter mode
cargo run --release
Run with a specific book
cargo run --release -- /path/to/book.epub
Requirements
Required:
- Rust toolchain
- C/C++ build toolchain (cc, linker, clang for bindgen toolchains)
- cmake
- ALSA runtime/dev (libasound)
- Piper voice model (.onnx + matching .onnx.json)
- eSpeak data directory
Recommended:
- pandoc for robust non-EPUB/plain conversion pipeline
Project-specific notes:
- espeak-rs-sys is patched to vendored path in Cargo.toml.
- .cargo/config.toml sets CMAKE_ARGS = "-DUSE_LIBPCAUDIO=OFF".
Signal Handling and Safe Exit
- Ctrl+C sets an atomic flag from signal handler.
- App polls system signals on subscription interval (120ms).
- On signal, app dispatches safe quit effect:
  - save per-book config
  - persist bookmark
  - stop playback
  - exit
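The flag-and-poll pattern described above can be sketched with a static AtomicBool. All names here (QUIT_REQUESTED, on_ctrl_c, poll_signal, QuitStep) are hypothetical, and the actual handler registration is omitted; on_ctrl_c only shows what such a handler would do.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Set by the signal handler, consumed by the UI polling tick.
static QUIT_REQUESTED: AtomicBool = AtomicBool::new(false);

/// What a Ctrl+C handler would do: set the flag and nothing else.
fn on_ctrl_c() {
    QUIT_REQUESTED.store(true, Ordering::SeqCst);
}

/// Steps of the safe quit effect, in the documented order.
#[derive(Debug, PartialEq)]
enum QuitStep {
    SaveConfig,
    SaveBookmark,
    StopPlayback,
    Exit,
}

/// Called on each polling tick. Returns the quit steps once per signal:
/// `swap` reads the flag and clears it in a single atomic operation.
fn poll_signal() -> Option<Vec<QuitStep>> {
    if QUIT_REQUESTED.swap(false, Ordering::SeqCst) {
        Some(vec![
            QuitStep::SaveConfig,
            QuitStep::SaveBookmark,
            QuitStep::StopPlayback,
            QuitStep::Exit,
        ])
    } else {
        None
    }
}
```

Keeping the handler down to a single atomic store leaves all file I/O (config and bookmark saves) on the normal application thread.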
Troubleshooting
espeak-rs-sys transmute warnings
- Warnings from generated bindgen output are non-fatal.
Vulkan Unrecognized present mode ...
- Usually driver/backend informational (wgpu-hal).
Missing/failed pandoc conversion
- Reader attempts fallback paths for .md and .epub.
- For non-EPUB formats beyond supported text/markdown, install/fix pandoc or use supported formats.
Cache confusion after normalization changes
- Normalization changes should generate new normalized cache keys.
- If you want a clean slate, remove relevant per-book cache directories under .cache/.
Dependency Compatibility Status
- Current checked stack is stable with piper-rs = 0.1.9 and ort = 2.0.0-rc.9 in lockfile.
- A full blanket cargo update currently pulls newer ort/ort-sys RCs and breaks the piper-rs build due to upstream incompatibility.
- Use targeted updates only until upstream versions align.
Development
Useful commands:
cargo check
cargo test
cargo fmt --all
cargo clippy --all-targets --all-features
When editing config schema:
- update src/config/models.rs
- update src/config/tables.rs
- update src/config/defaults.rs
- update conf/config.toml sample
- update README config reference
When editing TTS worker protocol:
- keep src/tts.rs request/response structures aligned with src/tts_worker.rs
License
See LICENSE.