Perspective projection is used to represent a three-dimensional scene on a two-dimensional medium, such as a computer screen.
So how do we do it?
Here we assume that everything happens behind the near plane (that is, "the computer screen"), inside a viewing frustum. The projection makes an object appear smaller the farther it is from the viewer.
The final goal is to map a 3D coordinate (x, y, z) to a 2D coordinate (x, y).
To do that, we draw a line from the 3D point in the viewing frustum to the camera and take the point where that line intersects the vertical near plane, as shown in the second image below (4 points in the viewing frustum map to 4 points on the vertical near plane).
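The line-to-camera construction above can be sketched in a few lines of Python. This is only a minimal illustration under assumptions that are not stated in the text: the camera sits at the origin, looks along the +z axis, and the near plane is at z = near. The function name project_to_near_plane and the near parameter are hypothetical. With those assumptions, similar triangles give a scale factor of near / z for both x and y.

```python
def project_to_near_plane(point, near=1.0):
    """Project a 3D point onto the near plane z = near.

    Assumes (for illustration only) a camera at the origin looking
    along +z. The line from the point to the camera crosses the near
    plane where x and y are scaled by near / z (similar triangles).
    """
    x, y, z = point
    if z <= 0:
        # The point is at or behind the camera, so the line never
        # crosses the near plane in front of the viewer.
        raise ValueError("point must be in front of the camera (z > 0)")
    scale = near / z
    return (x * scale, y * scale)


# The same (x, y) offset at two different depths:
close_point = project_to_near_plane((2.0, 4.0, 2.0))  # (1.0, 2.0)
far_point = project_to_near_plane((2.0, 4.0, 4.0))    # (0.5, 1.0)
```

Doubling the depth halves the projected coordinates, which is the "farther away looks smaller" effect described earlier. A full graphics pipeline would typically express this as a 4x4 matrix followed by a divide, which this sketch does not attempt.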
Source: this awesome YouTube video https://www.youtube.com/watch?v=eoXn6nwV694 For the derivation of the formula, see the video at timestamp 10:30.