SOURCE CHECKLIST

Video mocap source video checklist

Use the AIMoCap checklist to prepare short, clear source videos for AI mocap, FBX output, robot data, and custom avatars.

This checklist is for users whose mocap results depend on filming conditions, trim choices, lighting, and subject visibility.

Short answer

AIMoCap source videos usually work better when they use stable framing, visible full-body movement, good lighting, and a short trim window.

When to use AIMoCap

Use the source video checklist when you can provide a readable short clip and want AIMoCap browser/API processing with reviewable results.

When not to use AIMoCap

Do not expect poor lighting, heavy occlusion, multi-person overlap, or extreme camera motion to behave like controlled capture.

AI video mocap is sensitive to the source clip. The same tool can produce different results depending on framing, lighting, occlusion, trim length, and how readable the performer is.

This checklist is meant to reduce avoidable cleanup before you spend credits or API v-credit on processing.

A good checklist is useful because it catches problems before queue time: if the performer is hidden, the action is untrimmed, or the camera shakes through the key motion, the downstream output will usually need more cleanup.

The checklist is also a triage tool: some clips are ready to process, some should be trimmed, and some should be recaptured because the missing visual information cannot be recovered by an export setting.

Recommended source conditions

Short, stable clips are easier to review and process.
Clear silhouette and lighting help markerless motion estimation.
Trim start and end should isolate the intended action.
Source video quality affects all video mocap systems, not only AIMoCap.
One clearly visible performer is easier to process than overlapping people or crowd footage.
If the motion starts before the trim window or continues after it, the generated result may miss context that downstream cleanup expects.
Robot-oriented output should be treated as reviewable motion data for downstream simulation, controller, and safety validation, not as direct hardware control.
Side views, front views, and diagonal views can all work, but the performer should remain readable through the action rather than disappearing behind props or other people.
Hand-heavy actions need visible wrists and fingers when possible; if hands leave frame or merge with clothing, finger motion may require more cleanup.
Upper-body style clips should still keep enough torso context for stable body orientation; a close-up of only arms may not provide enough body reference.
A cleaner 8-second trimmed clip is usually more useful than a long untrimmed source with setup time, walking into frame, and unrelated motion.
A rerun is useful when trim, target, or FPS was wrong; recapture is usually the right fix when the source hides the body, loses hands, blurs contacts, or shakes through the action.
A source-video checklist should state hard no-go cases instead of implying that every clip can be saved by settings.
Use a four-grade source score before upload: A means process now, B means trim, crop, or brighten first, C means run one short test only if the motion is irreplaceable, and D means recapture.
Full-body mocap needs readable feet, hips, torso, arms, and hands; upper-body mocap can tolerate less lower-body signal but still needs stable torso and visible wrists for credible motion.
A source checklist should name the first fix: trim, crop, brighten, stabilize, choose upper-body/full-body, or recapture; this is more useful than a generic 'use high-quality video' rule.

Evaluation checklist

Grade the clip before upload: A means process now, B means trim/crop/brighten first, C means one short test only, and D means recapture.
Confirm the performer, torso, hands, and feet are readable for the intended capture type; upper-body clips still need stable torso and wrist visibility.
Trim the clip to one action and one acceptance question before spending credits or API v-credit.
Choose the output target before upload: Default FBX, custom avatar, MMD/TDA, Unitree G1, or another documented target.
After a failed result, label the first fix as trim, crop, brighten, stabilize, target change, FPS/import setting, or recapture instead of rerunning blindly.

Source video acceptance matrix

Use this matrix before spending credits or API v-credit. Some clips should be processed, some should be trimmed, and some should be recaptured.

| Scenario | Recommended action | Watch for |
| --- | --- | --- |
| Full body visible, static camera, one clear action | Process the clip, choose the target output, and inspect the result before downstream cleanup. | Fast footwork, hand-object contact, or turns that may still need cleanup even when the source is readable. |
| Long clip with several unrelated actions | Trim to one action first so the mocap job has one acceptance question and a clearer downstream artifact. | Wasted processing time and confusing failures caused by mixing warmup, action, and recovery in one upload. |
| Occlusion, cut-off limbs, shaky camera, or poor lighting | Recapture or choose a cleaner segment before processing if the performer is not readable. | Trying to fix a capture problem with export settings, target changes, or repeated reruns. |
| Hand-focused or upper-body action | Keep the upper torso, wrists, and hands visible, and avoid clothing or props that blend fingers into the background. | Expecting clean finger motion from a clip where hands are tiny, blurred, hidden, or outside the frame. |
| Robot-oriented motion review | Use clear human motion as input, then validate the robot-target output in simulation or controller tooling before hardware use. | Assuming a clean source video removes the need for robot-specific limits, balance, timing, or safety validation. |
| Result failed but source is readable | Adjust trim, output target, or downstream import assumptions first, then rerun only if the failure category points to job settings. | Recapturing good footage because the real problem was target selection, FPS, or downstream retargeting. |
| Result failed and source is unreadable | Recapture with full body, stable camera, clearer light, and visible hands/feet instead of rerunning the same file. | Spending more credits or v-credit on a clip that does not contain enough visual evidence for markerless mocap. |
| The action is mostly upper body | Choose upper-body capture only when torso, arms, wrists, and hands are readable; keep enough body context for stable orientation. | Cropping so tightly that the solver cannot infer torso direction, shoulder motion, or hand position relative to the body. |
| The clip is valuable but visually weak | Run one short trimmed test only after labeling it grade C and deciding what would count as an acceptable partial result. | Repeatedly processing the same weak clip without a pass/fail rule or recapture plan. |

Failure cases people mention most

The failure cases users report most often are practical ones: bad framing, occlusion, shaky cameras, long clips, and unclear downstream targets.

The first fix is often recapture, not settings

When a clip hides limbs, loses the subject, or has severe motion blur, the honest recommendation is often to recapture a cleaner clip instead of promising a parameter tweak.

Trim choices affect both speed and result quality

Long untrimmed uploads waste time and can add unrelated motion; a short action window is easier to process, inspect, and rerun.

Target choice should happen before upload

Users should know whether they want FBX animation, Unitree G1 robot motion, or custom-avatar review before spending credits on a job.

Pre-upload checklist

Use these facts to decide whether this workflow matches your output, integration, and cleanup needs.

Camera and framing

Keep the performer fully visible, avoid cutting off limbs, and reduce fast camera movement during the key action.

Lighting and occlusion

Use enough light for a clear silhouette and avoid props, people, or objects that hide the body during important motion.
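
As a rough illustration of the lighting check, the sketch below averages the brightness of one sampled frame. It assumes the frame is already available as a list of 8-bit (r, g, b) tuples; the function names and the `DARK_THRESHOLD` value are hypothetical and not part of AIMoCap. Mean brightness is only a crude proxy: a bright frame can still have a poor silhouette.

```python
# Minimal sketch: flag a frame as probably too dark before upload.
# Names and the threshold are illustrative assumptions, not AIMoCap settings.

DARK_THRESHOLD = 50.0  # hypothetical cut-off on a 0-255 luma scale


def mean_luma(pixels):
    """Return the average Rec. 601 luma (0-255) of (r, g, b) pixel tuples."""
    total = 0.0
    count = 0
    for r, g, b in pixels:
        # Rec. 601 weights for converting RGB to perceived brightness.
        total += 0.299 * r + 0.587 * g + 0.114 * b
        count += 1
    if count == 0:
        raise ValueError("frame contains no pixels")
    return total / count


def looks_too_dark(pixels, threshold=DARK_THRESHOLD):
    """True when the frame's mean luma falls below the chosen threshold."""
    return mean_luma(pixels) < threshold
```

A frame that fails this check is a candidate for the "brighten first" fix or for recapture; a frame that passes still needs a visual check for occlusion.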

Trim and duration

Trim to the useful action window so the job focuses on the motion that should become output data.
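
One way to trim locally before upload is with the ffmpeg command-line tool. The sketch below only builds the command; the helper name `build_trim_command` is hypothetical, and it assumes an ffmpeg build with the libx264 encoder and that audio is not needed for mocap.

```python
# Minimal sketch: build an ffmpeg command that keeps only the action window.
# The helper name is an illustrative assumption; the ffmpeg flags are standard.


def build_trim_command(src, dst, start_s, end_s):
    """Return an ffmpeg argument list that keeps [start_s, end_s) of src."""
    if start_s < 0 or end_s <= start_s:
        raise ValueError("need 0 <= start_s < end_s")
    duration = end_s - start_s
    return [
        "ffmpeg", "-y",
        "-ss", f"{start_s:.3f}",  # seek to the start of the action
        "-i", src,
        "-t", f"{duration:.3f}",  # keep only the action window
        "-c:v", "libx264",        # re-encode so the cut is not tied to keyframes
        "-an",                    # drop audio (assumed unnecessary for mocap)
        dst,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Re-encoding is slower than stream copy, but it avoids cuts that snap to the nearest keyframe and include unwanted motion.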

Subject count

Use one primary performer whenever possible; overlapping bodies make markerless motion interpretation harder.

Target expectation

Choose filming and trim settings based on the intended output: animation FBX, Unitree G1 robot data, or custom avatar review; robot data still needs downstream simulation or controller validation before hardware use.

Hand and wrist visibility

For hand-heavy motion, keep hands, wrists, and forearms visible whenever possible; occluded fingers are harder to reconstruct cleanly.

Clip length discipline

Cut out countdowns, camera setup, long idle segments, and recovery steps unless they are part of the motion you actually want to capture.

Rerun versus recapture

Rerun when parameters were wrong; recapture when the video lacks visual evidence such as visible feet, torso, wrists, hands, or stable framing.
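
The rerun-versus-recapture rule can be sketched as a small decision function. The problem labels and names below are hypothetical; they only mirror the categories this page lists and are not an AIMoCap API.

```python
# Minimal sketch: decide between rerun and recapture from labeled problems.
# All labels and names here are illustrative assumptions.
from enum import Enum


class NextStep(Enum):
    RERUN = "rerun"
    RECAPTURE = "recapture"


# Problems fixable by changing job or import parameters.
SETTINGS_PROBLEMS = {"trim", "target", "fps", "import"}
# Problems where the video lacks visual evidence.
SOURCE_PROBLEMS = {"cropped", "hidden", "blurred", "poorly_lit", "shaky", "overlap"}


def next_step(problems):
    """Return the next step for a failed result given a set of problem labels."""
    problems = set(problems)
    if not problems:
        raise ValueError("label at least one problem before rerunning")
    unknown = problems - SETTINGS_PROBLEMS - SOURCE_PROBLEMS
    if unknown:
        raise ValueError(f"unknown problem labels: {sorted(unknown)}")
    # Missing visual evidence cannot be recovered by settings, so it wins.
    if problems & SOURCE_PROBLEMS:
        return NextStep.RECAPTURE
    return NextStep.RERUN
```

Requiring at least one label is a deliberate choice: it blocks a blind rerun when nobody has classified the failure.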

Source grade A

Grade A means process now: one performer is fully readable, the camera is stable, the action is trimmed, hands and feet are visible, and the output target is chosen.

Source grade B

Trim, crop, or brighten first when the motion is readable but the clip includes idle time, loose framing, mild blur, or unnecessary setup and recovery segments.

Source grade C or D

Use grade C only for irreplaceable clips that need a test run; use grade D and recapture when limbs are cut off, hands are hidden, the camera shakes, or multiple people overlap.
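
The four grades can be written down as a simple function over manual observations. This is a minimal sketch: the field names are hypothetical, the booleans are filled in by a person watching the clip, and treating a missing output target as grade B is an assumption beyond what the grade descriptions state.

```python
# Minimal sketch: map manual pre-upload observations to a source grade.
# Field names are illustrative assumptions, not an AIMoCap schema.
from dataclasses import dataclass


@dataclass
class ClipChecks:
    single_performer: bool     # no overlapping people
    limbs_in_frame: bool       # no cut-off limbs (feet included for full body)
    hands_visible: bool
    camera_stable: bool
    well_lit: bool
    trimmed_to_action: bool    # no setup, idle, or recovery segments
    framing_tight: bool        # no loose framing that needs a crop
    target_chosen: bool        # output target decided before upload
    irreplaceable: bool = False  # footage cannot be reshot


def grade_clip(c):
    """Return 'A', 'B', 'C', or 'D' for one clip."""
    readable = (c.single_performer and c.limbs_in_frame
                and c.hands_visible and c.camera_stable)
    if not readable:
        # Visual evidence is missing: recapture, unless the clip is
        # irreplaceable and worth exactly one short test run.
        return "C" if c.irreplaceable else "D"
    if not (c.trimmed_to_action and c.framing_tight and c.well_lit):
        return "B"  # trim, crop, or brighten first
    # Assumption: an undecided target also blocks "process now".
    return "A" if c.target_chosen else "B"
```

Readability is checked first so that a clip with hidden hands never reaches grade B through trimming alone.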

Source video checklist

Frame the full body

Keep the performer visible and avoid cutting off limbs during the key motion.

Trim the action

Process the shortest useful window rather than a long raw clip with unrelated motion.

Reduce ambiguity

Avoid heavy occlusion, multiple overlapping people, fast camera motion, and very dark scenes.

Check the downstream target

Decide whether the clip is intended for FBX animation, Unitree G1 robot output, or custom avatar review before spending credits.

Keep one acceptance question per job

A source clip should answer one review question: one action, one performer, one target output, and a trim range that matches the useful motion.
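
The "one acceptance question per job" rule can be captured as a small job record that is validated before upload. The class, field names, and target identifiers below are hypothetical illustrations based on the targets named on this page; they are not the AIMoCap API schema.

```python
# Minimal sketch: a pre-upload job record with one action, one performer,
# one target, and one acceptance question. All names are assumptions.
from dataclasses import dataclass

# Hypothetical identifiers for the targets named on this page.
KNOWN_TARGETS = {"default_fbx", "custom_avatar", "mmd_tda", "unitree_g1"}


@dataclass(frozen=True)
class MocapJobSpec:
    clip_path: str
    action: str
    target: str
    trim_start_s: float
    trim_end_s: float
    acceptance_question: str
    performer_count: int = 1

    def problems(self):
        """Return a list of reasons this job is not ready to submit."""
        issues = []
        if not self.action.strip():
            issues.append("name the single action in the clip")
        if self.target not in KNOWN_TARGETS:
            issues.append("choose a known output target")
        if not 0 <= self.trim_start_s < self.trim_end_s:
            issues.append("trim range must satisfy 0 <= start < end")
        if not self.acceptance_question.strip():
            issues.append("state one acceptance question")
        if self.performer_count != 1:
            issues.append("use one primary performer")
        return issues
```

An empty list from `problems()` means the job has one clear review question; anything else names what to fix before spending credits.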

Decide rerun or recapture

If a result fails, classify whether the problem is clip readability, trim range, target choice, or downstream import before spending another processing run.

Common questions

Does source video quality matter for AI mocap?

Yes. Clear framing, lighting, visible limbs, and a short trim window can reduce avoidable errors.

Should I upload a full long video?

Usually no. Trim to the useful action window before processing when possible.

Can AIMoCap fix every poor source video?

No. AIMoCap can process readable source video, but extreme occlusion, poor lighting, and complex overlaps may still require recapture or cleanup.

Is one performer better than multiple people?

Yes. A single visible performer is usually easier for markerless mocap than overlapping people or crowded footage.

Should I record differently for robot output?

The source should still show clear human motion, but teams should also choose the intended target early because robot-oriented output and animation FBX have different review paths.

Does a good source video make robot output hardware-ready?

No. A clean source video can improve reviewable robot-motion data, but robotics teams still need simulation, controller checks, and safety validation before hardware use.

What is the best clip length for video mocap?

Use the shortest clip that contains the complete useful action. Short trimmed clips are easier to process, review, compare, and rerun than long raw recordings.

Can I use upper-body or hand-focused video?

Yes, when the torso, arms, wrists, and hands remain readable. Very tight crops or hidden hands can make body orientation and finger motion less reliable.

Should I rerun or recapture after a bad result?

Rerun when the trim, target, FPS, or downstream import choice was wrong. Recapture when the performer is cropped, hidden, blurred, poorly lit, or hard to separate from the background.
