Applying Simple Animal Behavior Models to Programming Autonomous Robotic Swarms

A proposal to the Kentucky Space Grant Consortium

Principal Investigator: Dr. William J. Tietjen

Department of Biology

Bellarmine University

Louisville, KY 40205

This proposal was approved and funded for July 2005 to June 2006.


Abstract

This proposal requests funds from the Kentucky Space Grant Consortium to support research on autonomous robotic swarms (groups of robots). The swarms will be of the open-agent type where there is no dedicated leader and communication among robotic agents does not require a structured communication system. Additionally, the robots will be completely autonomous: unlike most swarm research, no central host computer will coordinate their behavior or the relationships among agents.

 

The robots will be programmed using a paradigm known as “behavior-based” robotics. Invertebrate animal species will be used as models for programming the robots’ behavior. The animals will be presented with goals similar to those required of the robots. Their solutions to these tasks will be video recorded and analyzed. The resulting data will be used to program the robots to mimic the behavior of the animals. The robots will be videotaped while performing their tasks and the results will be compared to the animal models. If the robots do not behave as the animals do, their programming will be adjusted.

 

Four Sony AIBO robotic dogs will serve as test-beds for this project. The robots fulfill all the specifications for a behavior-based programming environment. These requirements include a hybrid deliberative/reactive system, short- vs. long-term behavioral memory, homeostasis, a variety of perceptual sensors, and a capacity for communication. In addition, the robots must be programmable to imitate various animal behaviors. Their rich array of sensors and their capacity to be programmed in both deliberative (goal-seeking) and reactive (reflexive) modes allow the AIBOs to interact with one another and their environment in real-time.

 

Robot swarms will be given two tasks to perform that require cooperation among the agents while simultaneously avoiding interfering with one another. The robots will accomplish these tasks by communicating over both visual and aural channels. Current robotic swarms that depend on a host computer for coordination become unwieldy as swarm size increases. It is likely the model I propose can support an arbitrary number of robots, arranged into “packs” of 3-5 agents. Robots could move freely from one pack to another while accomplishing their tasks.


Introduction

Researchers at the Jet Propulsion Laboratory (JPL) recognize that future planetary missions will require more autonomy than that of the Spirit and Opportunity rovers. Because of this, some JPL research has centered on simulating self-organizing robotic swarms with either closed- or open-agent architectures (Huntsberger, Aghazarian, Baumgartner, and Schenker 2000). Individuals in closed-agent systems interact over structured communication networks and there is a leader among the agents. For open-agent systems the communication is less structured, the leadership is dynamic, and agents can adapt to unstructured environments (Figure 0).

 


Figure 0. A comparison of closed- vs. open-agent systems.

 

At JPL these goal-oriented procedures are explored using a multi-robot control platform called BISMARC (Biologically Inspired System for Map-based Autonomous Rover Control; Huntsberger, 2001). BISMARC is a hybrid wavelet/neural network program that constructs an internal map modeled on a biological system: the hippocampal place cells. Such models are computationally expensive and require each robot to communicate its position to the host computer, which then processes the data and sends a map to each agent. The robot’s internal map is used to determine its position relative to other agents and topological landmarks. The map is continuously updated. This approach probably sets limits on the number of robots that can participate in a swarm since additional agents will eventually over-tax the capabilities of both the agents and host system.

 

Recent advances in the development of entertainment robots have produced robotic platforms that are accessible and serve as useful tools for education and research. One of the most powerful of these systems is the Sony AIBO robotic dog (Datcu, Richertt, Roberti, de Vries, and Rothkrantz, 2004; Tejada, Cristina, O’Hara, and Tarapore, 2004). I propose programming AIBOs to achieve many of the same goals as BISMARC, but I will model behaviors that do not require maps of the surroundings or positions of the agents. Computationally simple behaviors will be used to direct the robots to goals where they will perform a task. By simplifying the programming I anticipate that the swarm will be more robust and will adapt quickly to changing environments. The proposed project mirrors many aspects of the BISMARC project, but depends on a different paradigm and is likely to support an arbitrary number of robots arranged into “packs” of 3-5 agents.

 

The Sony AIBO robot meets all the specifications for a behavior-based programming environment (Arkin, 1998). These requirements include a hybrid deliberative/reactive system, short- vs. long-term behavioral memory, homeostasis, a variety of perceptual sensors, and a capacity for communication. In addition, the robots must be programmable so they can imitate various animal behaviors.

 

The AIBO environment supports the deliberative (goal-seeking) and reactive (reflexive) models. In the deliberative mode the robot can be programmed to seek a ball of a particular color, follow a moving sound, and recognize and approach another robot. Built-in routines that fulfill the requirements of the reactive model include object and precipice avoidance, power management, and righting reflexes (if it falls over). Sensory input is reacted to continuously and updated in real time to allow adaptive responses to a changing environment.

 

The deliberative mode can be programmed explicitly (e.g., find and approach a laser spot painting a target). Programming is simplified by calls to the operating system, Sony’s “robot abstraction”. To find a laser target, one simply indicates its color temperature (determined with the robot’s camera) and then calls the search and approach routines in the robot abstraction. There is no need to program the minutiae of neck movements or searching for colors in the visual field. Those problems are handled by the robot abstraction (even though one could program at this level if required).

 

A second level of the robot abstraction is available to allow programmers access to the interactions between the robot’s “mood” and “personality” attributes. The robot mood matrix holds the abstraction that describes the robot’s disposition, from happy to sad or angry, as determined by immediate events and through long-term memory. Immediate events that affect mood include falling over (increased “sadness”) and finding the ball (“happiness” increases). Other events can be programmed to affect the mood matrix and include verbal commands, finding an arbitrary object, and sounds.

 

The personality program is controlled by Sony’s complex state machine. For commercial programs the number of states that determine the overall complexity of the personality can be more than 3000. Conditions that cause state transitions (switching from one activity to another) can be greater than 11500. Training conditions that mimic learning can number over 1200. Both the mood matrix and conditions of the complex state machine can be programmed and set to a variety of values to permit simple to complex behavior patterns. Calls can be made to routines that recognize speaker-independent voice commands (more than 75), along with visual pattern recognition (including a human face). The results of these calls can be programmed to affect the state transitions. The short-term memory affects the immediate complex state machine while the long-term memory has a more general effect on the robot’s personality. The contents of the long-term memory can be stored on a PC for later analysis or to clone states among robots. The state transitions required for this project should only number several hundred and the more complex ones can be imported from public domain routines (e.g., www.aibohack.com). Thus, the programming required for the project is not overly ambitious.

 

Arkin’s homeostatic mechanisms are controlled at the middleware level of the AIBO operating system and most are inaccessible with the available programming tools. These include monitoring of core temperature, joint movement, and power management. A jammed leg, for example, overrides all programming and causes a system shutdown while low power can initiate a search, approach, and mounting of the charging station.

 

A number of perceptual sensors are available including a ranging system, touch sensors, binaural microphones, and a color video camera with IR sensitivity. The processor is capable of real-time integration of these subsystems in both reactive and deliberative modes.

 

The last of Arkin’s requirements, a capacity for communication, is fulfilled at two levels. When in a human-interactive mode the robot communicates its intentions and the contents of the mood matrix or complex state machine through the pattern and color of lights, various tones, speech, and body language. Human trainers interact with the robot through voice commands, touch sensors, hand signals, or by showing them visual patterns. Robots can communicate with one another over the auditory/vocalization channels.

 

The robots will be connected to a wireless LAN system and their internal states will be recorded and monitored using the Tekkotsu programming system (www-2.cs.cmu.edu/~tekkotsu; Carnegie Mellon Univ.). The Tekkotsu language permits programs and data on the host computer to influence or record the robot’s behavior. These programs include MatLab, neural networks, and C++ routines, as well as others. Although it will not serve this purpose for the proposed experiment, the LAN can also coordinate the behavior of AIBO swarms (Stone, Dresner, Fidelman, Jong, Kohl, Kuhlmann, Sridharan, and Stronger 2004). Additionally, robots can enter a remote mode to allow direct human-robot intervention.


Project Description

Four AIBO robots, a wireless LAN, the appropriate programming environments, and a well-equipped animal behavior laboratory are available to the PI. Facilities in the behavior lab include video recording equipment, several computers, video digitizers, and proprietary software to track and analyze the movements of robots and animals (Tietjen, 2005).

 


Figure 1. A comparison of a taxis and two kineses.

 

Figure 1A shows how a directed response to a stimulus (taxis) can be used so the robot can self-navigate to a location close to the second goal (a gradient). A laser will be used to paint the floor with the first goal: to find and move to the laser target. This will place the robot a short distance from the secondary goal (a visual gradient). The robot will then switch goals to locate the center of the gradient. To do so, the robot will model a kinesis. Unlike a taxis, which requires a line-of-sight stimulus (finding the laser spot), the direction of the stimulus is unknown for a kinesis. Two types of kineses are seen in animal systems. An orthokinesis occurs when an animal changes its rate of locomotion; moving faster when the stimulus is weak, then slower when the stimulus is stronger (eventually coming to a stop; Fig. 1B). For a klinokinesis the animal increases its rate of turning as it moves into a stronger portion of the gradient (Fig. 1C). The proposed models are extraordinarily powerful. As an example, male silkworm moths can locate a female over 11 km distant by detecting her odor and using a combination of a taxis and klinokinesis. In addition, if wind disperses her odor plume, the behavior is so robust that males can reacquire it. Although visual, the second stimulus could represent gradients formed by the out-gassing of a vent on a planetary surface or the presence of methane in a search and rescue mission.
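As an illustration only, the two kineses described above can be sketched in a few lines of Python. This is a toy simulation and not the AIBO or Tekkotsu programming itself: the function names, the linear shape of the gradient, the maximum speed, and the maximum turn angle are all assumptions made for the example. The 0.5 m radius follows the one-meter gradient diameter mentioned in this proposal.

```python
import math
import random

def gradient_intensity(x, y, radius=0.5):
    """Stimulus strength in [0, 1]: 1 at the gradient center (0, 0),
    falling linearly to 0 at `radius` meters (assumed gradient shape)."""
    return max(0.0, 1.0 - math.hypot(x, y) / radius)

def orthokinesis_speed(intensity, max_speed=0.10):
    """Orthokinesis: fast where the stimulus is weak, stopping where it is strongest."""
    return max_speed * (1.0 - intensity)

def klinokinesis_turn(intensity, rng, max_turn=math.pi):
    """Klinokinesis: a random turn whose size grows with stimulus strength."""
    return rng.uniform(-1.0, 1.0) * max_turn * intensity

def step(x, y, heading, rng, dt=1.0):
    """Advance one time step using only the local intensity.
    No information about the direction of the stimulus is used."""
    s = gradient_intensity(x, y)
    heading += klinokinesis_turn(s, rng)
    v = orthokinesis_speed(s)
    return x + v * dt * math.cos(heading), y + v * dt * math.sin(heading), heading

def walk(x, y, n_steps, seed=0):
    """Run `n_steps` of the combined kinesis from a starting point."""
    rng = random.Random(seed)
    heading = rng.uniform(0.0, 2.0 * math.pi)
    for _ in range(n_steps):
        x, y, heading = step(x, y, heading, rng)
    return x, y
```

In this sketch an agent that reaches the exact center stops moving, which corresponds to the "eventually coming to a stop" case of the orthokinesis. A real controller would replace `gradient_intensity` with a camera or gas-sensor reading.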

 

If the robot lowers its head to stare directly at the ground, visual stimuli (normally directional) can be modeled as a non-directional kinesis. When the camera is programmed with a region of interest, the sensitivity to non-directional cues can be adjusted. A ring of LED lamps will encircle the robot’s camera to avoid potential shading of the gradient by the robot’s movements. Unlike living animals, the robot can be programmed so it does not “cheat” by using the camera in a taxis mode to locate the center of the gradient. Later experiments will change the distance between the laser target and gradients and will increase the number of target gradients. Gradients that are found can be tagged with a flashing LED. Although I would tag these gradients, human interaction would not affect the integrity of the experiment since field robots could be designed with a similar tagging mechanism. Initially, the gradients will be a meter in diameter, but their size and shape will be varied. Gradients will be designed with a paint program and printed out. The proposed search routines are identical to those that might use a gas sensor rather than a camera and should be as capable as those seen in animal systems. 
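The region-of-interest idea can be sketched as follows. This is a hypothetical example, not a Tekkotsu or Sony API: it assumes the downward-facing camera frame is available as a list of rows of grayscale pixel values, and it reduces the frame to a single non-directional reading by averaging the pixels inside a rectangle. Shrinking or enlarging the rectangle is one way the sensitivity could be adjusted.

```python
def roi_mean(frame, top, left, height, width):
    """Mean pixel value inside a rectangular region of interest.

    `frame` is assumed to be a list of rows of grayscale values (hypothetical format).
    The result is a single scalar, so it carries no directional information.
    """
    rows = frame[top:top + height]
    pixels = [p for row in rows for p in row[left:left + width]]
    if not pixels:
        raise ValueError("region of interest is empty")
    return sum(pixels) / len(pixels)
```

Because the output is one number per frame, it could stand in for a gas-sensor reading in the kinesis routines without changing the search logic.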

 

Robot training will use explicit instructions (programmed through Tekkotsu), behavioral modification (human-robot interactions with reward and punishment) and a combination of each. Programs, the complex state machine, and mood matrices will be copied to the LAN host machine where promising behavioral modifications can be optimized. Favorable personalities will be cloned among all the robots.

 

The robots will be videotaped while they perform their tasks. When digitized, the robot paths can be analyzed and compared to animal models. Various invertebrate species will also be taped and analyzed. For example, pill bugs moving in a humidity gradient would be an appropriate model for a robotic orthokinetic response. To collect data on predator search patterns, visual predators such as jumping spiders or praying mantises will be recorded. Foraging by an ant colony or bark beetles could be useful models for swarm behavior. The results of the robot’s programming will be compared to an animal model and the program will be manipulated to mimic that model. One advantage of this approach is that robotic behavior will be directly based on problems evolution has already solved. Thus, there will be no need to “re-invent the wheel”.

 


Figure 2. Training a single robot.

 

Figure 2 depicts alternative training sessions. Training condition 1 is identical to that of Fig. 1, as previously discussed. Training condition 2 places one or more obstructions between the robot and laser target. With its vision blocked, the robot only knows that the first goal’s target is available. It would then use a search algorithm to move about until it visually locates the laser target. Perhaps a modification of the kinesis program or a predator search pattern could be employed. Changes between a kinesis and a predator search pattern could prove useful if the laser target is far from the gradient goal. Once an appropriate behavioral model has been obtained, it can be propagated among all robots to serve as the initial behavioral platform for a robotic swarm.
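The goal switching in the two training conditions can be sketched as a small state machine. This is a minimal illustration under stated assumptions: the mode names and the three boolean inputs are hypothetical, and the sketch is unrelated to Sony's complex state machine. It only shows the order of behaviors described above: search while the laser target is hidden, approach it by taxis once it is seen, then switch to a kinesis to find the gradient center.

```python
from enum import Enum, auto

class Mode(Enum):
    SEARCH = auto()    # target not visible: wander using a search pattern
    APPROACH = auto()  # taxis: move toward the visible laser target
    KINESIS = auto()   # non-directional search for the gradient center
    DONE = auto()

def next_mode(mode, target_visible, at_target, at_gradient_center):
    """Return the next behavior mode given the current mode and sensor flags."""
    if mode is Mode.SEARCH:
        return Mode.APPROACH if target_visible else Mode.SEARCH
    if mode is Mode.APPROACH:
        if at_target:
            return Mode.KINESIS
        # Fall back to searching if an obstruction hides the target again.
        return Mode.APPROACH if target_visible else Mode.SEARCH
    if mode is Mode.KINESIS:
        return Mode.DONE if at_gradient_center else Mode.KINESIS
    return Mode.DONE
```

Training condition 1 starts effectively in `APPROACH`, while training condition 2 starts in `SEARCH` because the obstruction blocks the line of sight.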

 


Figure 3. Training a robot swarm.

 

Figure 2B depicts a model for an open-agent robotic swarm where the relationship among agents is fluid. Under the conditions shown in Fig. 2B, Supervisor A has more precise data concerning the location of the laser target than does Supervisor B (determined by measuring the diameter of the target in the robot’s field of view). The Unaware Agent has its line of sight blocked by an obstruction. Supervisor A therefore becomes the leader for this scenario. The location can be transmitted by Supervisor A to Supervisor B and the Unaware Agent so they can efficiently locate the laser target.
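A minimal sketch of this dynamic leadership rule follows. It assumes each agent reports the apparent diameter of the laser target in its field of view (in pixels, or `None` when the target is hidden), and it assumes that a larger apparent diameter means more precise location data. The function name and the report format are hypothetical.

```python
def choose_leader(reports):
    """Pick the agent with the largest apparent target diameter.

    `reports` maps an agent name to the target diameter it sees,
    or to None if its line of sight is blocked. Returns None when
    no agent can see the target. Ties go to the first agent listed.
    """
    visible = {name: d for name, d in reports.items() if d is not None}
    if not visible:
        return None
    return max(visible, key=visible.get)
```

For example, with reports of 40 pixels from Supervisor A, 12 pixels from Supervisor B, and `None` from the Unaware Agent, the sketch selects Supervisor A. Because the rule is re-evaluated whenever reports change, leadership stays fluid, as the open-agent architecture requires.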

 

I plan on the robots communicating with one another through a series of tones (Figure 4; AIBOs can distinguish among different frequencies and determine their direction). Different tones will be used to announce to the rest of the swarm that an AIBO has found a goal, cannot see a goal, or is lost from the group. For example, the Unaware Agent in Fig. 2B could emit a tone indicating it cannot see the laser target. Supervisor B could emit another tone indicating that it sees the goal and the Unaware Agent can move toward Supervisor B until its vision is no longer obstructed. Each robot can be tagged as an individual by varying the frequencies of their tones and by marking them with different colored tape. Thus, robots will know the location of swarm members through visual and aural cues. Agents will be programmed to occasionally vocalize a chirp to transmit their location to other group members. This should maintain swarm cohesiveness in much the same way as the series of contact notes (peeps) of domestic chicks keeps the flock close to their mother. For a robot deployed in the field, flashing lights or short-range transmitters could be used to identify individual agents.
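One way such a tone vocabulary could be organized is sketched below. Every number here is an assumption chosen for illustration (the base frequency, the spacing between agents, and the spacing between messages), as are the message names. The sketch gives each agent its own frequency band and places each message at a fixed offset within that band, so one tone identifies both the sender and the announcement.

```python
# Hypothetical message set and frequency plan (not measured AIBO values).
MESSAGES = ("found_goal", "cannot_see_goal", "lost", "contact_chirp")
BASE_HZ = 1000.0
AGENT_SPACING_HZ = 500.0    # width of each agent's band
MESSAGE_SPACING_HZ = 100.0  # offset between messages inside a band

def encode(agent_id, message):
    """Frequency (Hz) that agent `agent_id` emits for `message`."""
    return (BASE_HZ
            + agent_id * AGENT_SPACING_HZ
            + MESSAGES.index(message) * MESSAGE_SPACING_HZ)

def decode(freq_hz, n_agents=4):
    """Return the (agent_id, message) whose nominal tone is closest to `freq_hz`."""
    candidates = [(abs(encode(a, m) - freq_hz), a, m)
                  for a in range(n_agents) for m in MESSAGES]
    _, agent_id, message = min(candidates)
    return agent_id, message
```

Decoding to the nearest nominal tone tolerates small errors in the measured frequency. A field system would also need to handle overlapping tones and background noise, which this sketch ignores.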

 


Figure 4. Individual robots will recognize other agents by color tags and varying frequency tones.

 

Several secondary goals will be tested for the swarm. They might attempt to find the center of multiple gradients as quickly as possible without interfering with one another or the swarm could be required to locate and then sit at the gradient perimeter.

 

The success of the individual and robotic swarm models will be analyzed by measuring the time required to solve problems under different conditions and the robustness of the programs to changing complexity in the environment. I will measure interference among swarm members by monitoring robot collisions and instances where two or more robots approach a goal when only a single agent is required. All robots will have identical personalities, which simplifies their programming and the requirements for an open-agent architecture. Only the criteria for recognizing other group agents will differ (visual and aural tags).
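The measures named above could be tabulated per experimental condition along the following lines. This is a sketch under assumptions: the record format (one dictionary per trial) and the field names are hypothetical, and the values would come from the digitized video and LAN recordings.

```python
from statistics import mean

def summarize_trials(trials):
    """Average the success and interference measures over a set of trials.

    Each trial is assumed to be a dict with the keys
    'solve_time_s', 'collisions', and 'redundant_approaches'
    (two or more robots approaching a goal that needs only one).
    """
    if not trials:
        raise ValueError("no trials to summarize")
    return {
        "mean_solve_time_s": mean(t["solve_time_s"] for t in trials),
        "collisions_per_trial": mean(t["collisions"] for t in trials),
        "redundant_approaches_per_trial": mean(t["redundant_approaches"] for t in trials),
    }
```

Comparing these summaries across conditions (for example, with and without obstructions, or with different swarm sizes) gives a simple view of how robust the behavior is to added complexity.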

 

It is likely this model can support an arbitrary number of robots, arranged into “packs” of 3-5 agents. Robots could move freely from one pack to another as long as agents can recognize their “twins” in the new pack as individuals that should be ignored (it may be possible to eliminate this constraint). The presence of twin agents should not confuse other pack members and they will communicate with both as equals. The twins will not be confused since they are likely to ask different questions of the pack and will be listening for the appropriate answer. If the question is the same, then the answer will also be the same. If there is confusion among agents, the contact chirps could have different tones or attributes.

 

   
For the individualized pack, each agent has a "name" that is recognized by other members of the pack.
In the anonymous pack there are no individuals.


 

The LAN will record the behavior, internal states, and visual field for analysis. If, for example, a robot stops short of finding the gradient center, the LAN recordings may indicate that the mood matrix peaks too rapidly and needs adjustment. Examination of the visual field could show why one agent has trouble locating a second. Linking the Tekkotsu programming environment with MatLab will simplify these analyses. The output from the monitoring software can be synced with digitized video and audio to simplify analysis of individual and swarm behaviors.

 

I have always had an interest in modeling animal behavior to better understand its underlying principles. To that end I was an early adopter of the first programmable consumer robot, the Heath Hero 1. Unfortunately, that platform was inappropriate for all but the crudest behaviors. I tried simulation programs, but found the virtual worlds too restrictive. When a robot is simulated in a virtual world, it is too easy to tweak variables or command programs to produce the desired outcome, rather than base changes on animal behavior. Typically, programmers trained in computer science look to animals for inspiration, and then they apply their own models to “improve” on behaviors that have evolved over hundreds of thousands, or even millions, of years. I intend always to return to the animals and to solve problems using evolved behaviors as a base. The Sony AIBO system is a real-time reactive system that is more than up to the task of modeling the behavior of invertebrate species.


Relevance to the NASA Program

The proposed project complements research at JPL by Huntsberger and his associates, but represents an approach that is considerably less processor-intensive. Also, swarm size can be more readily scaled. It is not readily apparent to me how the BISMARC system could be easily scaled since individual robots need to be monitored and controlled from a host system. If a gas sensor is used as an input, swarms built on the AIBO concept could be useful to NASA, DOD, Homeland Security, and in search and rescue operations. NASA and DOD are interested in examining existing technologies to determine if they can serve as platforms for development of applications that can be hardened for field purposes (Figure 5). The Sony AIBO and the programs I propose may represent one of these useful technologies.

 


Figure 5. Top: Methods usually followed by roboticists for a behavioral model. Bottom: Methods proposed for this project.

 

One undergraduate student will be supported by this project to record, digitize, and analyze animal and robotic behavior. In addition, two students from our computer science department have volunteered their help for the 2005/2006 academic year. If funded, this project will also enhance a collaboration that began this year with faculty and students at the University of Louisville’s Speed School.



References

Arkin, R. C. (1998). Behavior-Based Robotics. MIT Press, 491 pp.

Datcu, D., Richertt, M., Roberti, T., de Vries, W., and Rothkrantz, L. J. M. (2004). AIBO Robot as a soccer and rescue game player. Proceedings of GAME-ON 2004, ISBN 90-77381-15-5, 45-49.

Huntsberger, T., Aghazarian, H., Baumgartner, E., and Schenker, P. S. (2000). Behavior-based control systems for planetary autonomous robot outposts. Proc. IEEE AEROSPACE 2000, Big Sky, Montana, pp. 679-685.

Huntsberger, T. (2001). Biologically inspired autonomous rover control. Autonomous Robots, Vol. 11, No. 11, pp. 341-346.

Stone, P., Dresner, K., Fidelman, P., Jong, N., Kohl, N., Kuhlmann, G., Sridharan, M., and Stronger, D. (2004). The UT Austin Villa 2004 RoboCup Four-legged Team: Coming of Age. Technical Report UT-AI-TR-04-313, Austin, TX, 46 pp.

Tejada, S., Cristina, A., O’Hara, R., and Tarapore, S. (2004). Using Virtual Synergy for Artificial Intelligence and Robotics Education. AAAI Spring Symposium on Accessible Hands-on Artificial and Robotics Education, Stanford, CA, 5 pp.

Tietjen, W. (2005). Sublethal exposure to a neurotoxic pesticide affects activity of a variety of spider species. J. Arachnol. (in press).

Visser, A., Pavlin, G., van Gosliga, S., and Maris, M. (2004). Self-organization of multi-agent systems. Proc. of the International Workshop Military Applications of Agent Technology in ICT and Robotics.