
Say I wanted to characterize motion in a video and derive the frequency and extent of that motion. How would I go about it?

For instance: walking, jumping, running, punching, dancing, etc. The camera view might vary, and at greater distances it could be harder to make out features.

I am assuming I would use some sort of AI to identify objects and perhaps extract a skeleton, then compare against a model to determine the motion type and its attributes. I have seen camera systems capture a skeleton, but I think they required two views, such as the cheap motion-capture systems.

Could this be done in real time while the video plays, or would the video need to be preprocessed, with the resulting data used during playback?

I really don't know much about AI models, but I have been curious about how they can be used with video.
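One way to get the frequency and extent once you have per-frame keypoints from some pose estimator: track a single joint coordinate over time and take its FFT. This is only a sketch; `motion_stats` and the synthetic ankle trajectory below are made up for illustration, and a real pipeline would feed in coordinates from OpenPose/PoseNet instead.

```python
import numpy as np

def motion_stats(trajectory, fps):
    """Estimate the dominant frequency (Hz) and the extent (peak-to-peak
    range, in whatever units the keypoints use) of a 1-D joint trajectory
    sampled at `fps` frames per second."""
    x = np.asarray(trajectory, dtype=float)
    x = x - x.mean()                          # remove the DC offset before the FFT
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    dominant = freqs[np.argmax(spectrum[1:]) + 1]  # skip the 0 Hz bin
    extent = x.max() - x.min()
    return dominant, extent

# Synthetic example: an ankle keypoint bobbing at 2 Hz, filmed at 30 fps.
fps = 30
t = np.arange(0, 4, 1.0 / fps)                # 4 seconds of video
ankle_y = 100 + 15 * np.sin(2 * np.pi * 2.0 * t)
freq, extent = motion_stats(ankle_y, fps)     # freq comes out at 2.0 Hz
```

The frequency resolution is `fps / len(trajectory)`, so longer clips give sharper estimates; for real, noisy keypoints you would likely want to smooth the trajectory first.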

Comments
  • 0
    Looks like OpenPose could get me part of the way there.

    https://github.com/CMU-Perceptual-C...
  • 2
    PoseNet would be a good place to start in order to extract the skeleton from each frame. You can then use e.g. a simple Kalman filter to smoothly update the position of each vertex of the skeleton (new_position = kalman(current_position, position_from_image)).

    As for whether that could run in real time, PoseNet would probably require a decent GPU for that if you want to run it at 30 fps or more, but it's still a fairly lightweight model so it should be feasible.
  • 0
    @hitko Thanks, I will look into that. Didn't realize there were so many tools out there for this.
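The per-keypoint smoothing suggested above (new_position = kalman(current_position, position_from_image)) can be sketched with a scalar constant-position Kalman filter applied to each coordinate independently. The `q`/`r` noise values and the noisy synthetic trajectory below are illustrative assumptions, not part of PoseNet itself.

```python
import random

class Kalman1D:
    """Minimal scalar Kalman filter with a constant-position model.
    Run one instance per coordinate of each skeleton keypoint.
    q = process noise, r = measurement noise (hand-picked here)."""
    def __init__(self, q=1e-2, r=1.0):
        self.q, self.r = q, r
        self.x = None    # state estimate
        self.p = 1.0     # estimate variance

    def update(self, z):
        if self.x is None:                   # first measurement initialises the state
            self.x = z
            return self.x
        self.p += self.q                     # predict: variance grows by process noise
        k = self.p / (self.p + self.r)       # Kalman gain
        self.x += k * (z - self.x)           # correct toward the measurement
        self.p *= (1.0 - k)
        return self.x

# Smooth the noisy x-coordinate of one joint across 200 frames.
random.seed(0)
true_x = 50.0
noisy = [true_x + random.gauss(0, 3) for _ in range(200)]
f = Kalman1D(q=1e-3, r=9.0)
smoothed = [f.update(z) for z in noisy]
```

Raising `r` (or lowering `q`) makes the output smoother but laggier; a fancier version would use a constant-velocity state so fast limb motion is tracked with less lag.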