Basics before starting with Robotics — Part 9
Robots and Simulators
Previously, I wrote about Actions in ROS.
Today, I will talk about common robot subsystems and describe how the ROS architecture handles them. We will also talk about some robots and simulators in which we can most easily experiment with them.
Like all complex machines, robots are most easily designed and analyzed by considering one subsystem at a time. Broadly speaking, they can be divided into three categories: actuation, sensing, and computing. In the ROS context, actuation subsystems are those that directly drive the robot's wheels or arms. Sensing subsystems interact directly with sensor hardware, such as cameras or laser scanners. Finally, the computational subsystems tie actuators and sensing together, with (ideally) some relatively intelligent processing that allows the robot to perform useful tasks.
Actuation: Mobile Platform
The ability to move around, or locomote, is a fundamental capability of many robots. It is surprisingly nuanced: there are many books written entirely on this subject! However, broadly speaking, a mobile base is a collection of actuators that allow a robot to move around. They come in an astonishingly wide variety of shapes and sizes.
Although legged locomotion is popular in some domains in the research community, and camera-friendly walking robots have seen great progress in recent years, most robots drive around on wheels. This is because of two main reasons. First, wheeled platforms are often simpler to design and manufacture. Second, for the very smooth surfaces that are common in artificial environments, such as indoor floors or outdoor pavement, wheels are the most energy-efficient way to move around.
The simplest possible configuration of a wheeled mobile robot is called differential drive.
It consists of two independently actuated wheels, often located on the centerline of a round robot. In this configuration, the robot moves forward when both wheels turn forward, and spins in place when one wheel drives forward and one drives backward. Differential-drive robots often have one or more casters, which are unpowered wheels that spin freely to support the front and back of the robot, just like the wheels on the bottom of a typical
office chair. This is an example of a statically stable robot, which means that, when viewed from above, the center of mass of the robot is inside a polygon formed by the points of contact between the wheels and the ground. Statically stable robots are simple to model and control, and among their virtues is the fact that power can be shut off to the robot at any time, and it will not fall over.
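The kinematics of a differential-drive base are simple enough to sketch in a few lines. The following minimal Python function (the names and the wheel-separation parameter are illustrative, not taken from any particular ROS driver) maps the two wheel rim speeds to the motion of the chassis:

```python
def diff_drive_twist(v_left, v_right, wheel_separation):
    """Map left/right wheel rim speeds (m/s) to chassis motion.

    Returns (v, omega): forward speed (m/s) and turn rate (rad/s),
    with positive omega meaning a counterclockwise spin.
    """
    v = (v_left + v_right) / 2.0                    # average of the two wheels
    omega = (v_right - v_left) / wheel_separation   # speed difference turns the robot
    return v, omega
```

Equal wheel speeds give pure forward motion, and equal-but-opposite speeds spin the robot in place, exactly as described above.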
However, dynamically stable or balancing wheeled mobile robots are also possible, with the term dynamic implying that the actuators must constantly be in motion (however slight) to preserve stability. The simplest dynamically stable wheeled robots look like (and often are literally built upon) Segway platforms, with a pair of large differential-drive wheels supporting a tall robot above. Among the benefits of balancing wheeled mobile bases is that the wheels contacting the ground can have very large diameters, which allows
the robot to smoothly drive over small obstacles: imagine the difference between running over a pebble with an office-chair wheel versus a bicycle wheel (this is, in fact, precisely the reason why bicycle wheels are large). Another advantage of balancing wheeled mobile robots is that the footprint of the robot can be kept small, which can be useful in tight quarters.
The differential-drive scheme can be extended to more than two wheels and is often called skid steering. Four-wheel and six-wheel skid-steering schemes are common, in which all of the wheels on the left side of the robot actuate together, and all of the wheels on the right side actuate together. As the number of wheels extends beyond six, typically the wheels are connected by external tracks, as exemplified by excavators or tanks.
As is typically the case in engineering, there are trade-offs with the skid-steering scheme, and it makes sense for some applications, but not all. One advantage is that skid steering provides maximum traction while preserving mechanical simplicity (and thus controlling cost), since all contact points between the vehicle and the ground are being actively driven. However, skid steering is, as its name states, constantly skidding when it is not driving exactly forward or backward.
The inefficiencies and wear and tear of skid steering are among the reasons why passenger cars use more complex (and expensive) schemes to get around. They are often called Ackerman platforms, in which the rear wheels are always pointed straight ahead, and the front wheels turn together. Placing the wheels at the extreme corners of the vehicle maximizes the area of the supporting polygon, which is why cars can turn sharp corners without tipping over and (when not driven in action movies) car wheels do not have to skid when turning. However, the downside of Ackerman platforms is that they cannot drive sideways, since the rear wheels are always facing forward. This is why parallel parking is a dreaded portion of any driver’s license examination: elaborate planning and sequential actuator maneuvers are required to move an Ackerman platform sideways.
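To see why Ackerman wheels can roll without skidding, note that every wheel must follow a circle centered on the same turn center, so the inner front wheel has to steer more sharply than the outer one. A small sketch under the usual planar (bicycle-model) approximation, with illustrative names and parameters:

```python
import math

def ackerman_steer_angles(wheelbase, track, turn_radius):
    """Steering angles (rad) for the inner and outer front wheels.

    turn_radius is measured from the turn center to the midpoint of
    the rear axle. Each front wheel steers so that its axis points at
    the common turn center, which lets all wheels roll without skidding.
    """
    inner = math.atan(wheelbase / (turn_radius - track / 2.0))
    outer = math.atan(wheelbase / (turn_radius + track / 2.0))
    return inner, outer
```

Note that as the turn radius grows, both angles shrink toward zero (driving straight), and the difference between them vanishes.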
All of the platforms described thus far can be summarized as being non-holonomic, which means that they cannot instantaneously move in arbitrary directions. For example, neither differential-drive platforms nor Ackerman platforms can move sideways. To do this, a holonomic platform is required, which can be built using steered casters. Each steered caster has two motors: one rotates the wheel forward and backward, and another steers the wheel about its vertical axis. This allows the platform to move in any direction while rotating arbitrarily. Although significantly more complex to build and maintain, these platforms simplify motion planning. Imagine the ease of parallel parking if you could drive sideways into a parking spot!
As a special case, when the robot only needs to move on very smooth surfaces, a low-cost holonomic platform can be built using Mecanum wheels. These are clever contraptions in which each wheel has a series of rollers on its rim, angled at 45 degrees to the plane of the wheel. Using this scheme, motion in any direction (with any rate of rotation) is possible at all times, using only four actuators, without skidding. However, due to the small diameter of the roller wheels, it is only suitable for very smooth surfaces such as hard flooring or extremely short-pile carpets.
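The angled rollers let the four wheel speeds combine into motion in any planar direction. The standard inverse-kinematics equations for one common X-configuration layout can be sketched as follows (sign conventions vary between platforms, and the names here are illustrative):

```python
def mecanum_wheel_speeds(vx, vy, wz, lx, ly, r):
    """Wheel angular rates (rad/s) for a four-wheel Mecanum base.

    vx: forward speed (m/s), vy: leftward speed (m/s), wz: turn rate (rad/s).
    lx, ly: half the wheelbase and half the track (m); r: wheel radius (m).
    """
    k = lx + ly
    front_left  = (vx - vy - k * wz) / r
    front_right = (vx + vy + k * wz) / r
    rear_left   = (vx + vy - k * wz) / r
    rear_right  = (vx - vy + k * wz) / r
    return front_left, front_right, rear_left, rear_right
```

Driving straight forward spins all four wheels at the same rate, while pure sideways motion spins diagonal pairs in opposite directions, which is what pushes the robot along its rollers.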
Because one of the design goals of ROS is to allow software reuse across a variety of robots, ROS software that interacts with mobile platforms virtually always uses a Twist message. A twist is a way to express general linear and angular velocities in three dimensions. Although it may seem easier to express mobile base motions simply by expressing wheel velocities, using the linear and angular velocities of the center of the vehicle allows the software to abstract away the kinematics of the vehicle.
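In ROS, the Twist is carried in a geometry_msgs/Twist message, with three-dimensional linear and angular velocity fields; a planar base typically uses only linear.x and angular.z. A base driver then converts that abstract command into wheel motions. A minimal sketch of that conversion for a differential-drive base (function and parameter names are illustrative, not from any specific driver):

```python
def twist_to_wheel_rates(linear_x, angular_z, wheel_separation, wheel_radius):
    """Convert a planar Twist (m/s and rad/s) into per-wheel angular
    rates (rad/s) for a differential-drive base."""
    v_left  = linear_x - angular_z * wheel_separation / 2.0  # rim speed, left wheel
    v_right = linear_x + angular_z * wheel_separation / 2.0  # rim speed, right wheel
    return v_left / wheel_radius, v_right / wheel_radius
```

Software publishing the Twist never needs to know the wheel separation or radius; those details live entirely inside the driver, which is the point of the abstraction.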
Actuation: Manipulator Arm
Many robots need to manipulate objects in their environment. For example, packing or palletizing robots sit at the end of a production line, grab items coming down the line, and place them into boxes or stacks. There is an entire domain of robot manipulation tasks called pick and place, in which manipulator arms grasp items and place them somewhere else. Security robot tasks include handling suspicious items, for which a strong manipulator arm is often required. An emerging class of personal robots aims to be useful in home and office applications, performing manipulation tasks such as cleaning, delivering items, and preparing meals.
Although there are exceptions, the majority of manipulator arms are formed by a chain of rigid links connected by joints. The simplest kinds of joints are single-axis revolute joints (also called “pin” joints), where one link has a shaft that serves as the axis around which the next link rotates, in the same way that a typical residential door rotates around its hinge pins. However, linear joints (also called prismatic joints) are also common, in which one link has a slide or tube along which the next link travels, just as a sliding door runs sideways back and forth along its track.
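To make the chain-of-links idea concrete, here is a minimal forward-kinematics sketch for a planar arm with two revolute joints (names and link lengths are illustrative): each joint angle rotates everything downstream of it, and together the angles determine where the end of the chain sits.

```python
import math

def planar_arm_fk(theta1, theta2, l1, l2):
    """End-effector position of a planar two-revolute-joint arm.

    theta1 is the shoulder angle measured from the x-axis; theta2 is
    the elbow rotation relative to the first link. Lengths in meters.
    """
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y
```

With both joints at zero the arm lies stretched along the x-axis; real six- or seven-joint arms apply the same idea as a chain of 3D transforms rather than two sines and cosines.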
A fundamental characteristic of a robot manipulator is the number of degrees of freedom (DOF) of its design. Often, the number of joints is equal to the number of actuators; when those numbers differ, typically the DOF is taken to be the lower of the two numbers.
Regardless, the number of degrees of freedom is one of the most significant drivers of manipulator size, mass, dexterity, cost, and reliability. Adding DOF to the distal (far) end of a robot arm typically increases its mass, which requires larger actuators on the proximal (near) joints, which further increases the mass of the manipulator.
In general, six DOF are required to position the wrist of the manipulator arm in any location and orientation within its workspace, provided that each joint has full range of motion. In this context, workspace has a precise meaning: it is the space that a robot manipulator can reach. A subset of the robot’s workspace, called the dextrous workspace, is the region in which a robot can achieve all positions and orientations of the end effector. Generally speaking, having a larger dextrous workspace is a good thing for robots, but
unfortunately full (360-degree) range of motion on six joints of a robot is often extremely difficult to achieve at reasonable cost, due to constraints of mechanical structures, electrical wiring, and so on. As a result, seven-DOF arms are often used. The seventh DOF provides an extra degree of freedom that can be used to move the links of the arm while maintaining the position and orientation of the wrist, much as a human arm can move its elbow through an arc segment while maintaining the wrist in the same position.
This “extra” DOF can help contribute to a relatively large dextrous workspace even when each individual joint has a restricted range of motion.
Research robots intended for manipulation tasks in human environments often have human-scale, seven-DOF arms, quite simply because the desired workspaces are human-scale surfaces, such as tables or countertops in home and office environments. In contrast, robots intended for industrial applications have wildly varying dimensions and joint configurations depending on the tasks they are to perform, since each additional DOF
introduces additional cost and reliability concerns.
Sensing
Robots must sense the world around them in order to react to variations in tasks and environments. The sensors can range from minimalist setups designed for quick installation to highly elaborate and tremendously expensive sensor rigs.
Many successful industrial deployments use surprisingly little sensing. A remarkable number of complex and intricate industrial manipulation tasks can be performed through a combination of clever mechanical engineering and limit switches, which close or open an electrical circuit when a mechanical lever or plunger is pressed, in order to start execution of a preprogrammed robotic manipulation sequence. Through careful mechanical setup and tuning, these systems can achieve amazing levels of throughput and reliability.
Another class of sensors return scalar readings. For example, a pressure sensor can estimate the mechanical or barometric pressure and will typically output a scalar value along some range of sensitivity chosen at time of manufacture. Range sensors can be constructed from many physical phenomena (sound, light, etc.) and will also typically return a scalar value in some range, which seldom includes zero or infinity!
Higher-order animals tend to rely on visual data to react to the world around them. If only robots were as smart as animals! Unfortunately, using camera data intelligently is surprisingly difficult. However, cameras are cheap and often useful for teleoperation, so it is common to see them on robot sensor heads.
Interestingly, it is often more mathematically robust to describe robot tasks and environments in three dimensions (3D) than it is to work with 2D camera images. This is because the 3D shapes of tasks and environments are invariant to changes in scene lighting, shadows, occlusions, and so on. In fact, in a surprising number of application domains, the visual data is largely ignored; the algorithms are interested in 3D data. As a result, intense research efforts have been expended on producing 3D data of the scene in front of the robot.
When two cameras are rigidly mounted to a common mechanical structure, they form a stereo camera. Each camera sees a slightly different view of the world, and these slight differences can be used to estimate the distances to various features in the image.
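For a rectified stereo pair, the geometry reduces to a single formula: a feature's depth is inversely proportional to its disparity, the horizontal shift of the feature between the two images. A small sketch (names are illustrative):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth (m) of a feature seen by a rectified stereo camera pair.

    focal_px: focal length in pixels; baseline_m: distance between the
    two cameras; disparity_px: shift of the feature between the images.
    Nearby features shift a lot; distant features barely shift at all.
    """
    if disparity_px <= 0.0:
        return float('inf')  # zero disparity: feature is effectively at infinity
    return focal_px * baseline_m / disparity_px
```

This also shows why stereo fails on a featureless wall: with no texture, there is no feature to match between the two images, so no disparity can be measured at all.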
As discussed in the previous section, even though visual camera data is intuitively appealing, and seems like it should be useful somehow, many perception algorithms work much better with 3D data. Fortunately, the past few years have seen massive progress in low-cost depth cameras. Unlike the passive stereo cameras described in the previous section, depth cameras are active devices. They illuminate the scene in various ways, which greatly improves the system performance. For example, a completely featureless
indoor wall or surface is essentially impossible to detect using passive stereo vision.
However, many depth cameras will shine a texture pattern on the surface, which is subsequently imaged by its camera. The texture pattern and camera are typically set to operate in near-infrared wavelengths to reduce the system’s sensitivity to the colors of objects, as well as to not be distracting to people nearby.
Although depth cameras have greatly changed the depth-sensing market in the last few years due to their simplicity and low cost, there are still some applications in which laser scanners are widely used due to their superior accuracy and longer sensing range. There are many types of laser scanners, but one of the most common schemes used in robotics involves shining a laser beam on a rotating mirror spinning around 10 to 80 times per second (typically 600 to 4,800 RPM). As the mirror rotates, the laser light is pulsed
rapidly, and the reflected waveforms are correlated with the outgoing waveform to estimate the time of flight of the laser pulse for a series of angles around the scanner.
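The time-of-flight arithmetic behind each laser return is straightforward: the pulse travels out and back at the speed of light, so the range is half the round-trip distance, and each range pairs with the mirror angle at which it was measured. A minimal sketch (function names are illustrative):

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def pulse_range(round_trip_s):
    """Range (m) from a laser pulse's round-trip time of flight."""
    return SPEED_OF_LIGHT * round_trip_s / 2.0  # the light travels out and back

def scan_to_points(angles, round_trip_times):
    """Convert per-angle times of flight into 2D points in the scanner frame."""
    return [(pulse_range(t) * math.cos(a), pulse_range(t) * math.sin(a))
            for a, t in zip(angles, round_trip_times)]
```

The tiny times involved (a target 5 m away returns in about 33 nanoseconds) are why laser scanners need fast, precise timing electronics, and why they cost more than depth cameras.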
Estimating the motion of the robot is a critical component of virtually all robotic systems: motion estimates feed everything from low-level control schemes to high-level mapping, localization, and manipulation algorithms. Although estimates can be derived from many sources, the simplest and often most accurate are produced simply by counting how many times the motors or wheels have turned.
Many different types of shaft encoders are designed expressly for this purpose. Shaft encoders are typically constructed by attaching a marker to the shaft and measuring its motion relative to another frame of reference, such as the chassis of the robot or the previous link on a manipulator arm. The implementation may be done with magnets, optical discs, variable resistors, or variable capacitors, among many other options, with trade-offs including size, cost, accuracy, maximum speed, and whether the measurement is absolute or relative to the position at power-up. Regardless, the principle remains the same: the angular position of a marker on a shaft is measured relative to an adjacent frame of reference.
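One widespread relative-encoder scheme, quadrature encoding, uses two sensors (conventionally A and B) that produce square waves 90 degrees out of phase. The order in which the combined two-bit state steps through its cycle reveals the direction of rotation. A decoding sketch (one common convention; real encoders and decoder chips may differ):

```python
# Valid transitions of the 2-bit (A, B) state. Forward rotation walks
# 00 -> 01 -> 11 -> 10 -> 00; the reverse sequence walks backward.
_STEP = {
    (0b00, 0b01): +1, (0b01, 0b11): +1, (0b11, 0b10): +1, (0b10, 0b00): +1,
    (0b00, 0b10): -1, (0b10, 0b11): -1, (0b11, 0b01): -1, (0b01, 0b00): -1,
}

def decode_quadrature(samples):
    """Signed step count from a sequence of (A, B) samples (each 0 or 1)."""
    count = 0
    for (a0, b0), (a1, b1) in zip(samples, samples[1:]):
        # Unknown transitions (repeated samples, glitches) are ignored.
        count += _STEP.get((a0 << 1 | b0, a1 << 1 | b1), 0)
    return count
```

Because only one bit changes per valid transition, a single misread sample costs at most one count rather than scrambling the total, which is part of why this scheme is so robust.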
Just like automobile speedometers and odometers, shaft encoders are used to count the precise number of rotations of the robot’s wheels, and thereby estimate how far the vehicle has traveled and how much it has turned. Note that odometry is simply a count of how many times the drive wheels have turned, and is also known as dead reckoning in some domains. It is not a direct measurement of the vehicle position. Minute differences in wheel diameters, tire pressures, carpet weave direction (really!), axle misalignments, minor skidding, and countless other sources of error are cumulative over time. As a result, the raw odometry estimates of any robot will drift; the longer the robot drives, the more error accumulates in the estimate. For example, a robot traveling down the middle of a long, straight corridor will always have odometry that is a gradual curve. Put another way,
if both tires of a differential-drive robot are turned in the same direction at the exact same wheel velocity, the robot will never drive in a truly straight line. This is why mobile robots need additional sensors and clever algorithms to build maps and navigate.
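Dead reckoning from wheel travel can be sketched in a few lines. The function below (a simple midpoint-arc approximation; names are illustrative) advances a planar pose by the distances each wheel rolled during one encoder interval, and repeated calls accumulate exactly the small per-step errors described above:

```python
import math

def integrate_odometry(pose, d_left, d_right, wheel_separation):
    """Advance a (x, y, heading) pose by one interval of wheel travel.

    d_left and d_right are the distances (m) each wheel rolled, as
    derived from encoder counts. Any imbalance between them, however
    tiny, bends the estimated path, which is why raw odometry drifts.
    """
    x, y, theta = pose
    d_center = (d_left + d_right) / 2.0                  # distance traveled by the midpoint
    d_theta = (d_right - d_left) / wheel_separation      # change in heading
    x += d_center * math.cos(theta + d_theta / 2.0)
    y += d_center * math.sin(theta + d_theta / 2.0)
    return x, y, theta + d_theta
```

Feed it equal wheel distances and the pose moves in a perfectly straight line; add even a 1% imbalance and the estimate begins to curve, just as a real robot's odometry does in a long corridor.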
Shaft encoders are also used extensively in robot manipulators. The vast majority of manipulator arms have at least one shaft encoder for every rotary joint, and the vector of shaft encoder readings is often called the manipulator configuration. When combined with a geometric model of each link of a manipulator arm, the shaft encoders allow higher-level collision-avoidance, planning, and trajectory-following algorithms to control the robot.
Computation
Impressive robotic systems have been implemented on computing resources ranging from large racks of servers down to extremely small and efficient 8-bit microcontrollers. Fierce debates have raged throughout the history of robotics as to exactly how much computer processing is required to produce robust, useful robot behavior. Insect brains, for example, are extremely small and power-efficient, yet insects are arguably the most successful life forms on the planet. Biological brains process data very differently from “mainstream”
systems-engineering approaches of human technology, which has led to large and sustained research projects that study and try to replicate the success of bio-inspired computational architectures.
Many of the robots used in research settings are custom built to
investigate a particular research problem. However, there are a growing number of standard products that can be purchased and used “out of the box” for research, development, and operations in many domains of robotics.
The PR2 robot was one of the original ROS target platforms. In many ways, it was the “ultimate” research platform for service-robotics software at the time of its release in 2010. Its mobile base is actuated by four steerable casters and has a laser scanner for navigation. Atop this mobile base, the robot has a telescoping torso that carries two human-scale seven-DOF arms. The arms have a unique passive mechanical counterbalance, which permits the use of surprisingly low-power motors for human-scale arms.
The PR2 has a pan/tilt head equipped with a wide range of sensors, including a “nodding” laser scanner that can tilt up and down independently of the head, a pair of stereo cameras for short and long distances, and a Kinect depth camera. Additionally, each forearm of the robot has a camera, and the gripper fingertips have tactile sensors. All told, the PR2 has two laser scanners, six cameras, a depth camera, four tactile arrays, and 1 kHz encoder feedback. All of this data is handled by a pair of computers in the base of the robot, with an onboard gigabit network connecting them to a pair of WiFi radios.
Fetch is a mobile manipulation robot intended for warehouse applications. The design team at Fetch Robotics, Inc. includes many of those who designed the PR2 robot, and in some ways the Fetch robot can be seen as a smaller, more practical and cost-effective “spiritual successor” of the PR2. The single-arm robot is fully ROS-based and has a compact sensor head built around a depth camera. The differential-drive mobile base has a laser scanner intended for navigation purposes and a telescoping torso. At the time of writing, the price of the robot has not been publicly released, but it is expected to be much more affordable than the PR2.
The NASA/GM Robonaut 2 is a human-scale robot designed with the extreme
reliability and safety systems necessary for operation aboard the International Space Station. At the time of writing, the Robonaut 2 (a.k.a. R2) aboard the space station is running ROS for high-level task control.
The TurtleBot was designed in 2011 as a minimalist platform for ROS-based mobile robotics education and prototyping. It has a small differential-drive mobile base with an internal battery, power regulators, and charging contacts. Atop this base is a stack of laser-cut “shelves” that provide space to hold a netbook computer and depth camera, and lots of open space for prototyping. To control cost, the TurtleBot relies on a depth camera for range sensing; it does not have a laser scanner. Despite this, mapping and navigation can work quite well for indoor spaces.
Because the shelves of the TurtleBot are covered with mounting holes, many owners have added additional subsystems to their TurtleBots, such as small
manipulator arms, additional sensors, or upgraded computers. However, the “stock” TurtleBot is an excellent starting point for indoor mobile robotics. Many similar systems exist from other vendors, such as the Pioneer and Erratic robots and thousands of custom-built mobile robots around the world.
Although the preceding list of robots includes platforms that we consider to be remarkably low-cost compared to prior robots of similar capabilities, they are still significant investments. In addition, real robots require logistics including lab space, recharging of batteries, and operational quirks that often become part of the institutional knowledge of the organization operating the robot. Sadly, even the best robots break periodically due to various combinations of operator error, environmental conditions, manufacturing or
design defects, and so on.
Many of these headaches can be avoided by using simulated robots. At first glance, this seems to defeat the whole purpose of robotics; after all, the very definition of a robot involves perceiving and/or manipulating the environment. Software robots, however, are extraordinarily useful. In simulation, we can model as much or as little of reality as we desire. Sensors and actuators can be modeled as ideal devices, or they can incorporate various levels of distortion, errors, and unexpected faults. Although data logs can be used in automated test suites to verify that sensing algorithms produce expected results, automated testing of control algorithms typically requires simulated robots, since the algorithms under test need to be able to experience the consequences of their actions.
For many years, the two-dimensional simultaneous localization and mapping (SLAM) problem was one of the most heavily researched topics in the robotics community. A number of 2D simulators were developed in response to the need for repeatable experiments, as well as the many practical annoyances of gathering long datasets of robots driving down endless office corridors. Canonical laser range-finders and differential-drive robots were modeled, often using simple kinematic models that enforce that, for example, the robot stays plastered to a 2D surface and its range sensors only interact with vertical
walls, creating worlds that vaguely resemble that of Pac-Man. Although
limited in scope, these 2D simulators are very fast computationally, and they are generally quite simple to interact with.
Stage is an excellent example of this type of 2D simulator. It has a relatively simple modeling language that allows the creation of planar worlds with simple types of objects. Stage was designed from the outset to support multiple robots simultaneously interacting with the same world. It has been wrapped with a ROS integration package that accepts velocity commands from ROS and publishes odometry transforms as well as simulated laser range-finder scans from the robot(s) in the simulation.
Although Stage and other 2D simulators are computationally efficient and excel at simulating planar navigation in office-like environments, it is important to note that planar navigation is only one aspect of robotics. Even when only considering robot navigation, a vast array of environments require nonplanar motion, ranging from outdoor ground vehicles to aerial, underwater, and space robotics. Three-dimensional simulation is necessary for software development in these environments.
In general, robot motions can be divided into mobility and manipulation. The mobility aspects can be handled by two- or three-dimensional simulators in which the environment around the robot is static. Simulating manipulation, however, requires a significant increase in the complexity of the simulator to handle the dynamics of not just the robot, but also the dynamic models in the scene. For example, at the moment that a simulated household robot is picking up a handheld object, contact forces must be computed between the robot, the object, and the surface the object was previously resting upon.
Like all simulators, Gazebo is the product of a variety of trade-offs in its
design and implementation. Historically, Gazebo has used the Open Dynamics Engine for rigid-body physics, but recently it has gained the ability to choose between physics engines at startup. For the purposes of this book, we will be using Gazebo with either the Open Dynamics Engine or with the Bullet Physics library, both of which are capable of real-time simulation with relatively simple worlds and robots and, with some care, can produce physically plausible behavior.
There are many other simulators that can be used with ROS, such as MORSE and V-REP. Each simulator, whether it be Gazebo, Stage, MORSE, V-REP, turtlesim, or any other, has a different set of trade-offs. These include trade-offs in speed, accuracy, graphics quality, dimensionality (2D versus 3D), types of sensors supported, usability, platform support, and so on. No simulator of which we are aware is capable of maximizing all of those attributes simultaneously, so the choice of the “right” simulator for a particular task will be dependent on many factors.