The next volume after [BOORU CHARS 2024](https://n.acg2.icu/view/1927862), an image stream drawn from several imageboards,
based on **danbooru** (safe+questionable, **ID 8200000..9100000 = 24.09.2024..23.04.2025**),
with added "best of" furry-related **e621** and loli-enabled **gelbooru** content for nearly the same interval,
and also unique **zerochan** content for **ID 3960000..4430000 = 13.06.2023..06.03.2025**.
As usual:
- images initially filtered: Mpixels>=0.48, shorter_side>=600 px, file size>=60000 bytes, no animations
strip-shaped images dropped or cropped to aspect ratio 0.4..2.1
- PNG/WEBP/AVIF converted to JPG using **cjpegli 96% quality** (2000000 bytes limit)
modest downsampling done to longer side 2560px (landscape), 1920px (1x1), 2480px (portrait)
- verbose file naming used **"%website% - %id% - %up_to_3_copyrights% ~ %up_to_5_characters% (%up_to_2_artists%).jpg"**
files uniquely identified by "%website%+%id%"
- some general image statistics obtained with ExifTool and [ImageMagick](https://imagemagick.org)
- content analysis was mostly the same as in BC2023, with up-to-date software and models
- [CRAFT text detector](https://github.com/fcakyon/craft-text-detector) used to estimate the total size and number of text pieces
- torso components detected with a [custom PyTorch model](https://github.com/aperveyev/booru_yolo/tree/main/models)
built on [Ultralytics YOLOv11](https://github.com/ultralytics/ultralytics)
- clustering and within-cluster sorting implemented to group compositionally and visually similar pictures
**see the "readme" for details**
- images deduplicated using [AntiDupl](https://github.com/ermig1979/AntiDupl) up to 3-4% similarity, also against BOORU CHARS 2024, 2023 and 2022
- semi-automated quality check done as follows:
- real-life photos, character-free landscapes, food and macro shots discarded
- most comics and N-koma, text-heavy images and line art filtered out
- overly "questionable" images (uncensored nipples or vulva, obvious adult acts) moved to a [separate release](https://sukebei.n.acg2.icu/view/4284975)
- some background crops, gamma correction, rotation, denoising and other nontrivial improvements applied
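
The initial filter and downsampling rules listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual pipeline: `ImageInfo`, `passes_initial_filter` and `target_size` are hypothetical names, aspect ratio is assumed to be width/height, a megapixel is assumed to be 10^6 pixels, "1x1" is taken as exactly square, and strip-shaped images are simply rejected here rather than cropped.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ImageInfo:
    """Hypothetical container for the few properties the filter needs."""
    width: int
    height: int
    size_bytes: int
    animated: bool = False


def passes_initial_filter(img: ImageInfo) -> bool:
    """Apply the thresholds from the list above.

    Assumptions: aspect ratio = width / height, 1 Mpixel = 1_000_000 pixels.
    Cropping of out-of-range strips is not modeled; they are just rejected.
    """
    megapixels = img.width * img.height / 1_000_000
    aspect = img.width / img.height
    return (
        not img.animated
        and megapixels >= 0.48
        and min(img.width, img.height) >= 600
        and img.size_bytes >= 60_000
        and 0.4 <= aspect <= 2.1
    )


def target_size(width: int, height: int) -> Tuple[int, int]:
    """Return the size after capping the longer side.

    Caps: 2560 px (landscape), 1920 px (square), 2480 px (portrait).
    Images already within the cap are returned unchanged (no upscaling).
    """
    if width > height:
        cap = 2560
    elif width == height:
        cap = 1920
    else:
        cap = 2480
    longer = max(width, height)
    if longer <= cap:
        return width, height
    scale = cap / longer
    return round(width * scale), round(height * scale)
```

For example, under these assumptions a 5120x2880 landscape image would be scaled to 2560x1440, while a 1000x800 image would pass the filter and keep its size.
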
Besides images, the release contains tab-separated text files:
- **BC_2025.tsv** file/image related metadata, **896,142 rows**
- **BC_2025_tags.tsv** tags list with enrichment
- **BC_2025_yolo.tsv** detailed results for torso components detection
- **BC_2025_yolov11m_aa22.pt** PyTorch YOLOv11 model
and also an additional "readme".
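
Since files are identified by website + id, that key can be recovered from a file name, for instance to match images against rows of the TSV files. A minimal sketch, assuming the website name contains no spaces and the id is purely numeric; `file_key` is a hypothetical helper, the example file name below is invented, and no TSV column layout is assumed.

```python
import re
from typing import Optional, Tuple

# Matches only the leading "<website> - <id> - " part of the naming pattern.
# Copyrights, characters and artists are deliberately left unparsed,
# because their separators are not specified in this description.
_KEY_RE = re.compile(r"^(?P<website>\S+) - (?P<id>\d+) - ")


def file_key(filename: str) -> Optional[Tuple[str, int]]:
    """Return (website, id) for a release file name, or None if it does not match."""
    match = _KEY_RE.match(filename)
    if match is None:
        return None
    return match.group("website"), int(match.group("id"))


# Invented example name that follows the documented pattern.
example = "danbooru - 8200001 - touhou ~ hakurei reimu (artist name).jpg"
key = file_key(example)
```

Only the prefix is parsed on purpose: it is the part that the description declares unique, so it stays usable even if tag text contains unusual characters.
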
Keep in mind that this release (just like all other BOORU CHARS releases) is first of all
**a uniform dataset of character-centric art in effective local format suited for batch processing**
and then
**a representative catalog of anime/game/cartoon copyrights, characters and artists for visual estimation**
but it does
**not offer high image resolution or claim to be complete.**
Sample contact sheet with detections for volume 3x4-EP00 (single head, profile / from-behind POV, full body, best-quality volume)

**WARNING**: content is a little more NSFW than in its predecessors. Such themes weren't allowed before.
