Computer Vision Basics

Alex H. Macy
3 min read · Dec 8, 2020

The objective of computer vision is to make computers see and interpret the world as humans do, and possibly even better than we do.

Vision is not a simple task. Consider catching a ball. First, light rays from the ball pass through both eyes and strike their respective retinas. The retinas do some preliminary processing before sending the visual responses through the optic nerves to the brain, where the visual cortex analyzes them. The brain taps into its knowledge base, classifies the object and its dimensions, and, having predicted the ball's path, may decide to act by sending signals to move the hand and catch the ball.

Computer vision is concerned with the automatic extraction, analysis and understanding of useful information from a single image or a sequence of images. It involves the development of a theoretical and algorithmic basis to achieve automatic visual understanding.

Computer vision has a dual goal: it aims to build autonomous systems that can perform some of the tasks the human visual system performs, and to surpass it in many cases. Many vision tasks involve extracting 3D and temporal information from time-varying 2D data, or in other words, videos.

Humans can recognize faces under all kinds of variations in illumination, viewpoint, expression, etc. Computers find this much harder.

The field of computer vision heavily incorporates concepts from digital signal processing, neuroscience, and artificial intelligence. AI and CV share topics such as pattern recognition and machine learning. Consequently, CV is sometimes seen as part of AI, or of computer science in general. Its progress has been fueled by advances in computer architecture and software engineering. Many methods in CV are based on statistics, optimization, or geometry.

The CV pipeline begins with image acquisition. Solid-state physics is also related, since most CV systems depend on sensors that detect electromagnetic radiation, typically visible or infrared light. How that light interacts with surfaces is explained by optics, a branch of physics. Image acquisition devices capture visual information as digital signals, hence the need for digital signal processing. The field of digital image processing predominantly deals with image-to-image transformations. Typical image processing operations include compression, restoration, and enhancement.
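As an illustration of an image-to-image transformation, here is a minimal sketch of one simple enhancement operation, linear contrast stretching, written in plain Python. The function name `stretch_contrast` and the pixel values are made up for this example; a real pipeline would normally rely on an image processing library rather than nested lists.

```python
def stretch_contrast(image, out_min=0, out_max=255):
    """Linearly rescale pixel intensities so they span [out_min, out_max]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A perfectly flat image has no contrast to stretch; return a copy.
        return [row[:] for row in image]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (p - lo) * scale) for p in row] for row in image]


# Hypothetical low-contrast 3x3 grayscale patch (values are illustrative only).
patch = [
    [100, 110, 120],
    [110, 120, 130],
    [120, 130, 140],
]
enhanced = stretch_contrast(patch)
```

The input and output are both images of the same size, which is what distinguishes this kind of processing from the higher-level analysis that produces labels, measurements, or 3D structure.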

The next step is to preprocess the image for high-level analysis. This is where neuroscience plays a role; specifically, the study of the biological vision system. CV overlaps with computer graphics as well: computer graphics studies techniques that produce image data from 3D models, whereas CV works to produce 3D models from image data.
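The relationship between the two directions can be sketched with an idealized pinhole camera. This is a simplified illustration, not a full camera model: the function names, the unit focal length, and the sample points are all assumptions made for the example.

```python
def project(point, focal_length=1.0):
    """Graphics direction: map a 3D point (x, y, z) to 2D image coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


def back_project(pixel, depth, focal_length=1.0):
    """Vision direction: recover a 3D point from a pixel, given its depth."""
    u, v = pixel
    return (u * depth / focal_length, v * depth / focal_length, depth)


# Two different 3D points on the same viewing ray land on the same pixel.
near = (1.0, 2.0, 1.0)
far = (2.0, 4.0, 2.0)
same_pixel = project(near) == project(far)

# With the depth known, the original 3D point can be recovered.
recovered = back_project(project(far), depth=2.0)
```

Projection discards depth, so going from an image back to 3D needs extra information, such as a second view or prior knowledge about the scene. That is one reason the vision direction is the harder of the two.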

Machine vision is the process of applying a range of technologies and methods to provide image-based automatic inspection, process control, and robot guidance in industrial applications. Lighting is more controlled in machine vision than in computer vision.

Photogrammetry also overlaps with the field of computer vision. It is the science of making measurements from photographs, especially for recovering the exact positions of surface points captured in the images.

OpenCV is an open-source computer vision library that can be used from Python.

Computer vision is a subtopic of artificial intelligence and is relevant to image processing, machine learning, robotics, graphics, and cognitive science/neuroscience.

Computer vision emerged in the 1950s, with research aimed at replicating the eye, replicating the visual cortex, and replicating the rest of the brain. A breakthrough came in 1957 with the training of the Perceptron.
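To give a sense of what training a perceptron involves, here is a minimal sketch of the classic error-driven update rule in plain Python. It is a modern toy illustration rather than a reconstruction of the original 1957 system; the function names, the learning rate, and the logical-AND data set are assumptions chosen for the example.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Fit a single perceptron: adjust weights only when a prediction is wrong."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else 0
            delta = target - predicted  # 0 if correct, otherwise +1 or -1
            if delta != 0:
                weights = [w + lr * delta * xi for w, xi in zip(weights, x)]
                bias += lr * delta
                errors += 1
        if errors == 0:
            break  # every sample is classified correctly
    return weights, bias


def predict(weights, bias, x):
    """Classify one input vector as 0 or 1."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Toy, linearly separable data: logical AND of two binary inputs.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]
w, b = train_perceptron(inputs, targets)
predictions = [predict(w, b, x) for x in inputs]
```

A single perceptron can only separate classes with a straight line (or hyperplane), so this works for AND but would not for a problem such as XOR.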

Industrial vision systems are used in production lines. One of the key applications of CV is visual surveillance: human supervision simply cannot scale up to its needs, because there are too many objects and events to keep track of. Other applications include biometrics, such as fingerprint-based identification and authentication, and face recognition. CV can also be used for automation, and augmented reality is another application.
