The hospital situation, timing, and patient restrictions have become obstacles to an optimum therapy session. The crowdedness of the hospital might lead to a tight schedule and a shorter period of therapy. This condition might strike a post-stroke patient in a dilemma where they need regular treatment to recover their nervous system. In this work, we propose an in-house and uncomplex serious game system that can be used for physical therapy. The Kinect camera is used to capture the depth image stream of a human skeleton. Afterwards, the user might use their hand gesture to control the game. Voice recognition is deployed to ease them with play. Users must complete the given challenge to obtain a more significant outcome from this therapy system. Subjects will use their upper limb and hands to capture the 3D objects with different speeds and positions. The more substantial challenge, speed, and location will be increased and random. Each delegated entity will raise the scores. Afterwards, the scores will be further evaluated to correlate with therapy progress. Users are delighted with the system and eager to use it as their daily exercise. The experimental studies show a comparison between score and difficulty that represent characteristics of user and game. Users tend to quickly adapt to easy and medium levels, while high level requires better focus and proper synchronization between hand and eye to capture the 3D objects. The statistical analysis with a confidence rate(α:0.05) of the usability test shows that the proposed gaming is accessible, even without specialized training. It is not only for therapy but also for fitness because it can be used for body exercise. The result of the experiment is very satisfying. Most users enjoy and familiarize themselves quickly. The evaluation study demonstrates user satisfaction and perception during testing. Future work of the proposed serious game might involve haptic devices to stimulate their physical sensation.