Introduction

Welcome to the documentation for manga-image-translator-rust.

This book describes how to build, run, and deploy the project, and covers the available modules, features, config options, examples, and developer docs.

Please read at least up to "6. Python Renderer Usage" before creating any issues.

Quick Start

./mit-runtime -i in -o out

For more details, see Installation and Usage.

Installation

Binaries are available here for Windows, Linux, and macOS, for both arm64 and x86_64.

For faster execution, it is recommended to install CUDA and cuDNN.

Install CUDA 12.9

Install cuDNN

If you use CUDA, delete the cudnn* file in the downloaded folder. Otherwise, delete the onnxruntime CUDA execution provider.

Usage

❯ cargo r -p simple-runtime -- -i path/to/input -o path/to/output
❯ ./runtime -i path/to/input -o path/to/output

Options:
  -i, --input <INPUT>    Input file or directory
  -o, --output <OUTPUT>  Output directory
  -c, --config <CONFIG>  Optional config file
  -v, --verbose...       Verbose mode (-v, -vv, -vvv)
      --overwrite        Overwrite already translated images
  -h, --help             Print help
  -V, --version          Print version

Only these execution providers are supported right now:

  • coreml
  • cuda
  • cpu
  • tensorrt

For AMD support, look at how to enable ROCm for onnxruntime, or possibly ZLUDA.

Python Renderer Usage

The runtime can export the processed image before the text is rendered. This output can be used with the original renderer from the Python project.

After running the runtime you can run the Python renderer script.

Install

# Setup virtual environment
python3 -m venv venv && source venv/bin/activate

# Install dependencies
pip install numpy Pillow git+https://github.com/frederik-uni/manga-image-translator.git@renderer-module#subdirectory=pip-modules/mit-renderer

# Install Python renderer script
curl -O https://raw.githubusercontent.com/frederik-uni/manga-image-translator-rust/master/scripts/python-render.py

# Download fonts
REPO="zyddnys/manga-image-translator"; FOLDER="fonts"; BRANCH="main"
mkdir -p "$FOLDER"
curl -s "https://api.github.com/repos/$REPO/contents/$FOLDER?ref=$BRANCH" \
  | jq -r '.[] | select(.type=="file") | .download_url' \
  | while read -r url; do
      # Decode percent-encoded characters in the file name
      fname=$(python3 -c "import sys, urllib.parse; print(urllib.parse.unquote(sys.argv[1]))" "$(basename "$url")")
      curl -L "$url" -o "$FOLDER/$fname"
    done

Usage

❯ ./python-render.py -i path/to/input.mit.bin -o path/to/output.png
usage: python-render.py [-h] -i INPUT -o OUTPUT
                        [--renderer {Renderer.default,Renderer.manga2Eng,Renderer.manga2EngPillow}]
                        [--font-path FONT_PATH] [--line_spacing LINE_SPACING] [--no_hyphenation]
                        [--font_size FONT_SIZE] [--font_size_offset FONT_SIZE_OFFSET]
                        [--font_size_minimum FONT_SIZE_MINIMUM]
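
To render many exported pages at once, the renderer can be driven from a small wrapper. The following is a minimal sketch, not part of the project: it assumes all exports sit in one directory and end in `.mit.bin` (as in the example path above), and that `python-render.py` is in the current directory. The function names `build_render_commands` and `render_all` are hypothetical. Only the `-i` and `-o` flags from the usage text are used.

```python
import subprocess
import sys
from pathlib import Path

SUFFIX = ".mit.bin"  # assumed export suffix, taken from the usage example


def build_render_commands(export_dir, out_dir, script="python-render.py"):
    """Return one renderer command per *.mit.bin file found in export_dir."""
    commands = []
    for export in sorted(Path(export_dir).glob("*" + SUFFIX)):
        # "page.mit.bin" -> "page.png"
        png_name = export.name[: -len(SUFFIX)] + ".png"
        commands.append([
            sys.executable, script,
            "-i", str(export),
            "-o", str(Path(out_dir) / png_name),
        ])
    return commands


def render_all(export_dir, out_dir, script="python-render.py"):
    """Run the renderer for every export; stops at the first failing page."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_render_commands(export_dir, out_dir, script):
        subprocess.run(cmd, check=True)
```

Calling `render_all("out", "rendered")` would then invoke the script once per export, inside the virtual environment created during installation.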

Modules

This project is composed of modular components. Short descriptions and links to subpages:

Detectors

See Detectors.

OCRs

See OCRs.

Upscalers

See Upscalers.

Inpainters

See Inpainters.

Translators

See Translators.

Detectors

Model           Paper         Train  Source
dbnet           arXiv, arXiv  /      GitHub
ctd             /             /      GitHub
dbnet_convnext  /             /      GitHub
Paddle          /             Docs   GitHub

OCRs

Model      Paper  Train  Source
manga-ocr  /      Docs   GitHub

Upscaler

Model    Paper  Train  Source
ESRGAN   arXiv  Docs   GitHub
Waifu2x  arXiv  Docs   GitHub (maintained), GitHub (original)
Anime4k                GitHub

Inpainter

Model       Paper         Train  Source
Lama AOT
Lama Large  arXiv, arXiv         GitHub, HomePage, GitHub
Lama MPE    arXiv                GitHub

Translators

CPP Dependencies

Roadmap

  • detectors
    • dbnet
    • ctd
    • paddle
    • dbnet_convnext
    • yolo5
    • ysg
    • craft
  • ocr
  • inpainter
    • color
    • lama_aot
    • lama_large
    • lama_mpe
    • sd
    • patchmatch
  • colorizer
    • none
    • mc2
  • renderer
    • struct
    • gimp
    • pdf
    • psd
    • html
    • [~] png/jpeg/qoi
  • upscaler
    • anime4k
    • waifu2x
    • esrgan
  • translator
    • baidu
    • caiyun
    • google
    • m2m100
    • mbart
    • nllb
    • none
    • original
    • papago
    • sugoi
    • jparacrawl
    • youdao
    • deepl
    • qwen2
    • chatgpt
    • groq
    • deepseek
    • gemini
    • sakura
  • cleanup code
  • more tests (100% test coverage)
  • more benchmarks
  • optimize code
  • [~] error handling
  • replace clipper2
  • replace opencv
  • ci
    • cargo build
    • gh publish
    • cargo test
    • cargo fmt
    • cargo clippy
    • cargo doc
    • cargo tarpaulin
    • pyo3 publish
      • macos arm64
      • macos x86_64
      • linux x86_64
      • linux arm64
      • windows x86_64
      • windows arm64 (no prebuilt clang)
      • windows x86

Build

Preparation All

Install Rust with rustup

Install CUDA 12.9

Install cuDNN

Preparation Ubuntu/Debian

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
sudo apt install -y cuda-12-9 cudnn9

sudo apt-get install -y pkg-config libssl-dev libopencv-dev clang libclang-dev libfontconfig-dev

Preparation macOS

brew install llvm opencv
# Older Macs only
brew install openssl@3
# Run this in every terminal session (only required for release builds, not for debug builds)
export OPENCV_LINK_LIBS=+static=opencv_core,static=opencv_imgproc,static=opencv_calib3d,static=libtegra_hal,tbb,static=ittnotify,framework=OpenCL,z

Preparation Windows

choco install opencv llvm

$env:OPENCV_LINK_LIBS = $libName # opencv_world*.lib. It's the only .lib file in C:\tools\opencv if you use the prebuilts
$env:OPENCV_LINK_PATHS = $libPath # the parent folder of the opencv_world*.lib file, for example "C:\tools\opencv\build\x64\vc16\lib"
$env:OPENCV_INCLUDE_PATHS = $includePath # most likely "C:\tools\opencv\build\include"
$env:Path = "C:\tools\opencv\build\x64\vc16\bin;" + $env:Path
$env:Path = "C:\Program Files\NVIDIA\CUDNN\v9.13\bin\12.9;" + $env:Path

Or, to set them permanently:

[System.Environment]::SetEnvironmentVariable("OPENCV_LINK_LIBS", "opencv_world4110", "User")
[System.Environment]::SetEnvironmentVariable("OPENCV_LINK_PATHS", "C:\tools\opencv\build\x64\vc16\lib", "User")
[System.Environment]::SetEnvironmentVariable("OPENCV_INCLUDE_PATHS", "C:\tools\opencv\build\include", "User")
[System.Environment]::SetEnvironmentVariable("Path", "C:\tools\opencv\build\x64\vc16\bin;" + $env:Path, "User")

Path too long error 1, Path too long error 2

Quick Start

git clone https://github.com/frederik-uni/manga-image-translator-rust --recursive

cargo r -p simple-runtime -- -i in -o out

Deploy

When releasing the application, these files need to be included:

  • (cuda/cudnn)
  • opencv
  • onnxruntime execution providers
  • main binary

Binary Data Structure Version 1

Note:

  • All numbers are little-endian.
  • n indicates a previously read length.
  • ? means variable size (compute from other definitions).

Export

Size  Type   Description
9     _      unknown/reserved
4     uint   version
?     Image  embedded Image
8     uint   number of patches
?×n   Patch  n patches
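
As a minimal sketch of how the fixed-size start of an export could be read, the snippet below skips the 9 reserved bytes and decodes the 4-byte little-endian version with Python's `struct` module. The function name `read_export_header` is hypothetical, and the sample bytes are synthetic rather than taken from a real export.

```python
import struct


def read_export_header(buf: bytes, offset: int = 0):
    """Skip the 9 reserved bytes, read the version; return (version, new_offset)."""
    reserved = buf[offset:offset + 9]  # meaning is not documented, so it is skipped
    if len(reserved) != 9:
        raise ValueError("buffer too short for the reserved header bytes")
    offset += 9
    (version,) = struct.unpack_from("<I", buf, offset)  # 4-byte little-endian uint
    return version, offset + 4


# Synthetic example data, not a real .mit.bin file
sample = bytes(9) + struct.pack("<I", 1)
version, next_offset = read_export_header(sample)
print(version, next_offset)  # prints "1 13"
```

The returned offset points at the embedded Image that follows the version field.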

Image

Size  Type   Description
2     uint   width
2     uint   height
1     bool   raw
8     uint   buffer length
n     bytes  buffer
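
The Image layout can be decoded with one `struct` call for the 13-byte fixed part followed by a slice for the buffer. This is a sketch under the stated layout only: the `Image` dataclass and `read_image` are hypothetical names, and what `raw` implies about the buffer contents (presumably unencoded pixels versus an encoded image) is an assumption that the table does not confirm.

```python
import struct
from dataclasses import dataclass

FIXED = struct.Struct("<HH?Q")  # width, height, raw flag, buffer length (13 bytes)


@dataclass
class Image:
    width: int
    height: int
    raw: bool
    buffer: bytes


def read_image(buf: bytes, offset: int = 0):
    """Decode one Image starting at offset; return (Image, new_offset)."""
    width, height, raw, length = FIXED.unpack_from(buf, offset)
    start = offset + FIXED.size
    end = start + length
    if end > len(buf):
        raise ValueError("buffer length field points past the end of the data")
    return Image(width, height, raw, buf[start:end]), end
```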

4PTS (4 Points)

Size  Type         Description
64    4×[int,int]  4 (x, y) coordinates
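
Since 64 bytes hold 8 integers, each coordinate is 8 bytes wide. The sketch below assumes those integers are signed and stored in x, y order for each point. Neither detail is stated in the table, and `read_4pts` is a hypothetical helper name.

```python
import struct


def read_4pts(buf: bytes, offset: int = 0):
    """Decode four (x, y) points from 64 bytes; return (points, new_offset)."""
    # Assumption: eight signed 64-bit little-endian integers, x before y
    values = struct.unpack_from("<8q", buf, offset)
    points = [(values[i], values[i + 1]) for i in range(0, 8, 2)]
    return points, offset + 64
```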

TextBlock

Size  Type   Description
8     uint   font size
8     float  angle
8     float  probability
1     _      unknown/reserved
1     bool   fg_color available
0|1   uint   fg_r (if available)
0|1   uint   fg_g (if available)
0|1   uint   fg_b (if available)
1     bool   bg_color available
0|1   uint   bg_r (if available)
0|1   uint   bg_g (if available)
0|1   uint   bg_b (if available)
8     uint   original text length
n     bytes  original text
8     uint   4PTS count
n×64  4PTS   4PTS data
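
The variable-size parts of a TextBlock are the two optional colors, the text, and the list of 4PTS entries. The following sketch walks the fields in table order. It assumes the text bytes are UTF-8 and that 4PTS coordinates are signed 64-bit integers, neither of which the tables state. `TextBlock`, `read_text_block`, and `_read_optional_rgb` are hypothetical names.

```python
import struct
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TextBlock:
    font_size: int
    angle: float
    probability: float
    fg_color: Optional[Tuple[int, int, int]]
    bg_color: Optional[Tuple[int, int, int]]
    text: str
    quads: List[List[Tuple[int, int]]]


def _read_optional_rgb(buf, offset):
    """Read a 1-byte flag, then three 1-byte channels only if the flag is set."""
    (available,) = struct.unpack_from("<?", buf, offset)
    offset += 1
    if not available:
        return None, offset
    return struct.unpack_from("<3B", buf, offset), offset + 3


def read_text_block(buf: bytes, offset: int = 0):
    """Decode one TextBlock; return (TextBlock, new_offset)."""
    font_size, angle, probability = struct.unpack_from("<Qdd", buf, offset)
    offset += 24
    offset += 1  # unknown/reserved byte
    fg, offset = _read_optional_rgb(buf, offset)
    bg, offset = _read_optional_rgb(buf, offset)
    (text_len,) = struct.unpack_from("<Q", buf, offset)
    offset += 8
    text = buf[offset:offset + text_len].decode("utf-8")  # assumed encoding
    offset += text_len
    (count,) = struct.unpack_from("<Q", buf, offset)
    offset += 8
    quads = []
    for _ in range(count):
        v = struct.unpack_from("<8q", buf, offset)  # assumed signed 64-bit ints
        quads.append([(v[i], v[i + 1]) for i in range(0, 8, 2)])
        offset += 64
    return TextBlock(font_size, angle, probability, fg, bg, text, quads), offset
```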

Patch

Size  Type       Description
8     float      x
8     float      y
?     Image      embedded Image
?     TextBlock  embedded TextBlock
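
Because a Patch embeds two variable-size structures, a reader has to thread the offset through each nested decoder. The sketch below shows that composition only. The Image and TextBlock decoders are passed in as caller-supplied functions, and all names here are hypothetical. Each decoder is expected to take `(buf, offset)` and return `(value, new_offset)`.

```python
import struct


def read_patch(buf, offset, read_image, read_text_block):
    """Read one Patch: x, y, then the embedded Image and TextBlock."""
    x, y = struct.unpack_from("<dd", buf, offset)  # two 8-byte little-endian floats
    offset += 16
    image, offset = read_image(buf, offset)
    text_block, offset = read_text_block(buf, offset)
    return {"x": x, "y": y, "image": image, "text_block": text_block}, offset


def read_patches(buf, offset, read_image, read_text_block):
    """Read the 8-byte patch count from the Export layout, then that many patches."""
    (count,) = struct.unpack_from("<Q", buf, offset)
    offset += 8
    patches = []
    for _ in range(count):
        patch, offset = read_patch(buf, offset, read_image, read_text_block)
        patches.append(patch)
    return patches, offset
```

Passing the decoders as parameters keeps this piece independent of how Image and TextBlock are actually parsed.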