Identity-Preserving Animated Avatar System with Interactive Rendering.
Animate Me is a modular AI system that enables you to:
- Generate cartoon characters from text prompts or input images.
- Preserve user identity traits after stylization.
- Create motion (GIFs) from static images.
- Render characters inside an interactive, game-like environment.
The system is designed with production thinking: clear modularization, easy model replacement, and straightforward scalability.
- Multi-stage AI pipeline: generation → segmentation → pose → animation → rendering.
- Clear separation between model layer, orchestration layer, and interface layer.
- FastAPI + WebSocket integration for backend interaction.
- Interactive runtime support with Pygame.
The goal is to build a digital avatar that can:
- Preserve identity characteristics.
- Generate natural motion from static images.
- Support real-time interaction.
Key challenges:
- Identity Preservation: stylization often removes distinctive facial features.
- Static-to-Dynamic Conversion: generating smooth motion from a single input frame.
- Temporal Consistency: minimizing frame-to-frame flicker.
- Interactive Rendering: integrating animation into a runtime environment.
Animate Me addresses the entire pipeline rather than isolated sub-problems.
```
Input (Text / Image)
        ↓
Text-to-Image / Upload Handler
        ↓
Style Transfer Module
        ↓
Object Decomposition & Face Segmentation
        ↓
Pose Estimation
        ↓
Motion / Animation Generator
        ↓
GIF Export | Interactive Renderer (Pygame)
```
**Model Layer**
- Text-to-Image
- Style Transfer
- Segmentation
- Pose Estimation
- Motion Synthesis

**Pipeline Layer**
- Orchestration
- Action scheduling
- Frame generation
- Character state management

**Interface Layer**
- Streamlit demo
- FastAPI backend
- WebSocket server
- Pygame runtime
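The character state management behind the Pygame runtime can be sketched as pure frame-scheduling logic. The class and field names below are illustrative assumptions; the real renderer lives in `src/render/`:

```python
# Hypothetical frame scheduler for the interactive runtime.
# Maps elapsed time to a looping frame index for the current action.
from dataclasses import dataclass

@dataclass
class CharacterState:
    frames_per_action: dict  # action name -> number of frames
    action: str = "idle"
    fps: int = 12            # assumed playback rate

    def frame_at(self, elapsed_s: float) -> int:
        """Return the looping frame index after `elapsed_s` seconds."""
        n = self.frames_per_action[self.action]
        return int(elapsed_s * self.fps) % n

# Inside the Pygame loop you would blit the selected frame each tick:
#   screen.blit(frames[state.action][state.frame_at(t)], (x, y))
```

Keeping the scheduling logic separate from Pygame makes it testable without a display and easy to reuse in the GIF exporter.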
- Core: Python 3.8, PyTorch, OpenCV.
- AI Components: Text-to-Image, Style Transfer, OpenMMLab/MMPose, Segmentation.
- Backend/Runtime: FastAPI, WebSocket, Streamlit, Pygame.
- Environment: Conda (recommended), CUDA (optional).
- Receive input from text or image.
- Generate or normalize the character image.
- Apply reference style.
- Decompose foreground/background and segment the face.
- Estimate poses for each target action.
- Generate animation frame sequences.
- Export GIFs or render in an interactive environment.
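The steps above can be sketched as a simple stage chain. The stage functions here are placeholders standing in for the real modules (`text_to_image`, `image_style_transfer`, `face_segmenter`, `pose_estimator`, `animator`); only the control flow is meant to be representative:

```python
# Hedged sketch of pipeline orchestration: each stage consumes the
# previous stage's output. Stages below are placeholder lambdas.
from typing import Callable, List

def run_pipeline(inp, stages: List[Callable]):
    """Feed the output of each stage into the next."""
    out = inp
    for stage in stages:
        out = stage(out)
    return out

stages = [
    lambda x: x + ["generate"],  # text-to-image / upload handler
    lambda x: x + ["stylize"],   # style transfer
    lambda x: x + ["segment"],   # decomposition + face segmentation
    lambda x: x + ["pose"],      # pose estimation per action
    lambda x: x + ["animate"],   # frame sequence generation
]

result = run_pipeline(["input"], stages)
```

This linear-chain design is what makes model replacement straightforward: any stage can be swapped as long as it accepts the previous stage's output type.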
```
animating_image/
├── src/
│   ├── app/                  # FastAPI backend
│   ├── demo/                 # Streamlit demo
│   ├── pipeline/             # Pipeline orchestration
│   ├── animator/             # Motion synthesis engine
│   ├── pose_estimator/       # Pose estimation
│   ├── image_style_transfer/ # Stylization
│   ├── concept_decomposer/   # Object decomposition
│   ├── face_segmenter/       # Face segmentation
│   ├── img_to_vector/        # Vectorization
│   ├── render/               # Runtime/game rendering
│   ├── text_to_image/        # Text-to-image
│   ├── text_to_speech/       # TTS
│   ├── configs/              # Character config
│   └── __main__.py
├── external/                 # Third-party models
├── assets/
├── notebook/
├── requirements.txt
└── environment.yaml
```
```bash
git clone <your-repo-url>
cd animating_image
```

Conda (recommended):

```bash
conda env create -f environment.yaml
conda activate openmmlab
```

pip/venv:

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -r requirements.txt
```

Create a `.env` file:

```
GOOGLE_API_KEY=your_google_api_key
POSE_MODEL_CFG_PATH=/absolute/path/to/mmpose_config.py
POSE_MODEL_CKPT_PATH=/absolute/path/to/mmpose_checkpoint.pth

# Optional for API/runtime
STORAGE_ROOT=/absolute/path/to/storage
SERVER_IP=0.0.0.0
SERVER_PORT=8765
TARGET_OBJECT=/absolute/path/to/target_object.json
THIRD_PARTY_WEBSOCKET_URL=ws://host:port
```

Run the Streamlit demos:

```bash
streamlit run src/demo/app.py
streamlit run src/demo/create_animation_demo.py
```

Run the FastAPI backend:

```bash
uvicorn src.app.main:app --reload
```

Run the full pipeline:

```bash
python -m src
```

Run tests:

```bash
python -m src.test_pipeline
python -m src.test_animation
python -m src.test_pygame
python -m src.test_tts
```

- Split the pipeline into independent modules.
- Containerize services when needed.
- Support GPU acceleration.
- Manage checkpoints/secrets via environment variables.
- Design APIs with stateless principles.
- Expand toward a microservice architecture when needed.
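Reading deployment settings from the environment, as the guidelines above suggest, could look like the sketch below. The variable names follow the `.env` keys shown earlier; the defaults here are illustrative assumptions:

```python
# Sketch of environment-driven, stateless configuration.
# Defaults are placeholders, not the project's actual values.
import os

def load_settings() -> dict:
    return {
        "server_ip": os.environ.get("SERVER_IP", "0.0.0.0"),
        "server_port": int(os.environ.get("SERVER_PORT", "8765")),
        "storage_root": os.environ.get("STORAGE_ROOT", "./storage"),
    }
```

Because all configuration arrives via the environment, each service instance can be started, replaced, or scaled without shared local state.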
- Reduce actions/frames to improve speed.
- Cache intermediate outputs.
- Batch pose generation when appropriate.
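Caching intermediate outputs, as suggested above, could be sketched as a small disk cache keyed on the stage name and its input. Hashing a JSON payload is an assumption for illustration; the project may key its caches differently:

```python
# Hypothetical disk cache for expensive pipeline stages: the first call
# runs the stage and saves its result; later identical calls reuse it.
import hashlib
import json
import pathlib

CACHE_DIR = pathlib.Path(".cache")

def cached(stage_name: str, fn, payload: dict):
    """Run `fn(payload)` once per (stage, payload); reuse the saved result."""
    CACHE_DIR.mkdir(exist_ok=True)
    key = hashlib.sha256(
        (stage_name + json.dumps(payload, sort_keys=True)).encode()
    ).hexdigest()[:16]
    path = CACHE_DIR / f"{stage_name}-{key}.json"
    if path.exists():
        return json.loads(path.read_text())
    result = fn(payload)
    path.write_text(json.dumps(result))
    return result
```

This is most useful for stages whose inputs rarely change between runs, such as stylization and pose estimation for a fixed set of actions.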
- Animated Drawings (Facebook Research): https://github.com/facebookresearch/AnimatedDrawings.git
- Frontend (AnimGen Studio): https://github.com/tamchamchi/animgen-studio.git
- Missing API key or model path: check your `.env` file.
- `src` import errors: run commands from the project root.
- Checkpoint not found: use absolute paths and verify files exist.
- Slow CPU execution: reduce actions/frames or use CUDA GPU.

