This study investigates visual-sensor-assisted navigation for aerial robots. The main objective is to provide an aerial robot with localization and mapping capabilities in global positioning system (GPS) denied environments. When an aerial robot navigates in a GPS-denied environment, a visual sensor can provide measurements for estimating the robot’s state and mapping the environment. Because of the limited carrying capacity of an aerial robot, a single camera is used in this study, and its images are transmitted through a radio frequency module to a PC-based controller for image processing. An extended Kalman filter serves as the state estimator, recursively predicting and updating the states of the aerial robot and the environmental landmarks. The contributions of this study are twofold. First, an ultrasonic sensor provides one-dimensional distance measurements that solve the map scale determination problem of monocular vision. Second, image depth is represented using the inverse depth parameterization method, and image features are initialized by a non-delayed procedure. The software for the robot navigation system was developed on the PC-based controller and integrates the sensor inputs, image processing, and state estimation. The resulting system was used to perform simultaneous localization and mapping for aerial robots.
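As an illustration of the inverse depth parameterization mentioned above, the following minimal sketch converts a feature stored as an anchor position, two viewing angles, and an inverse depth into a Euclidean point. The function name `inverse_depth_to_point`, the argument layout, and the angle convention for the ray direction (the form commonly used in the monocular SLAM literature) are assumptions made for illustration. They are not taken from this study's implementation.

```python
import math


def inverse_depth_to_point(anchor, theta, phi, rho):
    """Convert an inverse-depth feature to a 3D point (illustrative sketch).

    anchor : (x, y, z) camera position when the feature was first observed
    theta  : azimuth of the viewing ray in radians (assumed convention)
    phi    : elevation of the viewing ray in radians (assumed convention)
    rho    : inverse depth along the ray, in 1/metres, must be positive
    """
    if rho <= 0.0:
        raise ValueError("inverse depth must be positive to yield a finite point")

    # Unit direction of the viewing ray under the assumed angle convention.
    direction = (
        math.cos(phi) * math.sin(theta),
        -math.sin(phi),
        math.cos(phi) * math.cos(theta),
    )

    # The point lies at distance 1/rho from the anchor along the ray.
    depth = 1.0 / rho
    return tuple(a + depth * d for a, d in zip(anchor, direction))


if __name__ == "__main__":
    # A feature seen straight ahead of the anchor with inverse depth 0.5 (1/m).
    print(inverse_depth_to_point((0.0, 0.0, 0.0), 0.0, 0.0, 0.5))
```

Because the feature is stored with its inverse depth rather than its depth, a newly observed feature can be added to the filter state from a single image with a broad prior on `rho`, which is what makes a non-delayed initialization possible. The absolute scale of `1/rho` still has to come from a metric measurement such as the ultrasonic range.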