phil@robophil.com

Philip English

Robotics Enthusiast, Director, Investor, Trainer, Author and Vlogger

Roboception Interview with Dr Michael Suppa

Hi guys, Philip English here from robophil.com. Welcome to the Robot Optimized Podcast, where we talk about everything robotics related. For our next episode, we have Dr. Michael Suppa, who will talk to us about "Eyes and Brains for your Robot".

Philip English

Hi, guys. It’s Philip English here doing our Robophil interviews. Today we’ve got Michael Suppa from Roboception. And yeah, as normal, we’re just going to have a run through and have a chat with Michael, just to get an understanding of himself and what Roboception does. Yeah. Welcome, Michael.

Dr Michael Suppa

Hi Philip, nice to meet you today.

Philip English

Cool, thank you. Thank you. So to start with, I’m quite interested in getting a bit of background about yourself: how you came up with the idea, and a bit of the background and history of how you got into the robot scene and the robot market.

Dr Michael Suppa

Yeah, sure. So basically I started with studies in electrical engineering at the University of Hanover. I finished those studies and then went to the German Aerospace Center (DLR) as a research engineer, where we focused on robot vision for robotic systems; I worked there for about 14 years. Then we decided that this 3D vision work was something we should actually bring to market, and based on that we founded Roboception in 2015 as a spin-off from the German Aerospace Center’s robotics institute. So we have a research background in robotics and spun that off into a startup company. Since then we have been working mostly in industrial automation and logistics, and more recently also lab automation, but we also keep close contact with universities, as I’m also a university professor, so I still keep in touch with the research community while being in a startup. We also do a lot of innovation research projects together with universities. We always want to keep up to date with the latest developments from research and see, on the other side, what’s needed in the market. So that’s a little bit our positioning here.

Philip English

Right. Fantastic, fantastic. So this is obviously 3D camera perception. What’s the difference between a normal camera and a 3D camera?

Dr Michael Suppa

Perfect. Sure. So usually when you have a 2D camera, you either need to know the size of the object or the exact distance between the camera and the object in order to do pose estimation, which is what we are mostly doing. With a 3D camera, that comes basically out of the system itself, because you either have a stereo system, which we are mostly using, or time of flight or others, where you get the z dimension directly out of the camera. So you get more flexibility in the positioning of the camera and the size and scale of the object. And especially if you want to do robotics, usually you’re interested in picking up the part in the end from an arbitrary position. That’s usually easier to do with 3D vision than with 2D, especially when you have the parts in an unordered space, in bins or something like this, where the z dimension is changing and varying over time. So 3D gives you the distance to the object directly, as opposed to something that you have to compute from the camera image itself. On our end, we’re using stereo vision, and this helps us to do both, actually: we have a good 2D image and we have depth. So in our everyday work, we’re actually combining 2D and 3D at all times.
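
As a rough illustration of the point about depth coming directly from the stereo system: the distance to a point follows from the disparity between the left and right images, the focal length and the baseline. The snippet below is a minimal sketch with made-up numbers, not Roboception’s implementation.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth of a point from its stereo disparity: Z = f * B / d."""
    return focal_px * baseline_m / np.asarray(disparity_px, dtype=float)

# Illustrative numbers only: 65 mm baseline, ~1000 px focal length.
focal_px = 1000.0
baseline_m = 0.065
for d in (100.0, 50.0, 25.0):  # larger disparity means a closer object
    z = depth_from_disparity(d, focal_px, baseline_m)
    print(f"disparity {d:5.1f} px -> depth {z:.3f} m")
```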

Philip English

Right, I see. So as you’re saying, if you’re picking something from a bin, the camera sees it in 2D, but it still may not have an understanding of how far away the actual piece is, especially if it’s a corner piece. But the stereo 3D camera system gives it depth, so it can go down there and actually grab at the right place. So that’s how it works. I’ve got a lot of experience with LiDAR and SICK laser scanners. Is that a different level? Is that something else, or can that fit in alongside cameras as well?

Dr Michael Suppa

Yeah, so actually, I started my career with a lot of laser scanning technology. There’s one issue with it: you usually have to move the sensor to scan. So there is an active motion, and the points taken at the beginning of the scan are taken at a different time than those at the end. If you have a camera, you have a nice shot in one image, so to say. So that’s one thing; especially when you have moving objects or something like this, it’s more helpful to go with a camera. But still, the laser gives you a point cloud and maybe an intensity image, but you don’t get a 2D image on top, unless you have a calibrated camera as well. And one thing I see, especially when you care about edges and you want to go for corners that are interesting for detecting the boundary of parts: whether you see them in the point cloud or not depends a lot on the scanning time of the laser, because it’s really tied to the motion of the scanning process. It’s just an additional thing you have to deal with when you want to do a detection. Most early 3D sensors were lasers; I think nowadays time of flight, stereo and structured light are the most common technologies that we are using.
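
To make the motion-skew point concrete: every point acquired later in a scan is displaced by however far the object has moved during the scan. A tiny illustrative calculation, with made-up numbers:

```python
def scan_skew_m(object_speed_mps, scan_duration_s):
    """Displacement between the first and last points of one scan of a moving object."""
    return object_speed_mps * scan_duration_s

# E.g. a 0.1 s laser scan of a part moving at 0.5 m/s on a conveyor:
print(f"Edges smeared by {scan_skew_m(0.5, 0.1) * 1000:.0f} mm across the point cloud")
```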

Philip English

Well, it makes sense. Yeah. Because if you have a fixed camera and a conveyor belt, then unless you’re moving the camera around to enable the laser scanner to work efficiently, the stereo makes sense, so you can see it. Yeah. Okay. So for Roboception, then: what would you say are the main three problems that you see with vision?

Dr Michael Suppa

Yeah. So on our end we are mostly in the robot domain, so we are caring for vision problems with robots, so to say. And for me there are basically two categories. There are known objects, objects where we have models and we basically just have to detect the correct position of the part, and then there are unknown parts which we don’t have models of, where we have to determine the correct orientation, which is more difficult because we don’t have the model. So these are the two things I see at the moment. The last and quite difficult challenge then is transparent objects, for example: where you have to determine the correct position of, let’s say, a transparent object like glass or anything like that, which is always hard to see with the sensor. At the moment it’s per domain. If you’re in industrial automation, most of the time you have CAD models and you can use them for detection. If you’re in logistics you never have any model, because there’s a large variety of parts coming in. Usually you only have to do pick and drop, no place, which is easier on that end, but the parts are large and they vary a lot. And the last one, lab automation, which we see at the moment, is where we have transparent objects that we have to correctly detect, and they’re mostly also fragile, so you have to approach them very precisely so as not to break them. So that’s basically the challenge. In general, shiny and transparent, that’s usually the harder part, so to say. And if you go into logistics, both happen: the parts are not known, they’re shiny and they can be transparent. So this is one of the key challenges that we are seeing at the moment.

Philip English

Right, I see. How about speed? So if a part is on a conveyor belt, is it going to be at a certain speed, or can you really ramp it up? I suppose cameras are fast, so it can probably go just as fast as any conveyor belt. But the slower it goes, I’m guessing, the more time the stereo has to work out where the part actually is. Have you seen a lot of applications where speed is a variable that you need to consider?

Dr Michael Suppa

So from the Roboception side, we are not so much focused on high-speed applications. In many cases we see that parts are separated on conveyors, and then it’s fine and you can have the speed. What is more relevant is that you have constant speeds that you know, because independently of how fast the camera is, you detect first and you grasp afterwards, so you need to account for the distance travelled between the detection and the grasping. So speed matters mostly in terms of the distance between where the part is detected and where it is picked. But it’s also an engineering thing to keep the conveyor speed constant and make sure the part is in the same orientation it was detected in when it is picked. In many cases we have scenarios where parts are on top of each other, even on the conveyor, and then usually they go at lower speeds, because you don’t want them to fall around while they’re moving across the conveyor before they can be picked. So I think it’s more or less a physics problem, so to say: the faster they go, the more shuffling of the objects you have in between. The top speed that we are seeing at the moment is 0.5 meters/second, but conveyors can go up to 2 meters/second. So I would say with vision we are at the lower speeds at the moment, especially with stereo; mono cameras can be faster, but then you have the depth and constant-speed issues as well.
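
To put the “distance travelled between detection and grasping” point in concrete terms, here is a back-of-the-envelope sketch with hypothetical numbers (the latencies are invented for illustration):

```python
def pick_offset_m(belt_speed_mps, detection_latency_s, robot_motion_s):
    """Distance the part travels along the belt between image capture and grasp."""
    return belt_speed_mps * (detection_latency_s + robot_motion_s)

# Illustrative values: 0.5 m/s belt, 0.3 s vision latency, 0.8 s robot motion.
offset = pick_offset_m(0.5, 0.3, 0.8)
print(f"The grasp pose must be projected {offset:.2f} m downstream of the detected pose.")
```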

Philip English

I see. And then, when you have the parts: I’ve seen on some cameras a form of machine intelligence where you have to teach the camera certain objects. Does Roboception work like that? So an object comes up, it has to verify what it is, and then you can essentially teach the camera so that the next time it sees it, it actually recognises it. Is that how it works, or is that another part of the process?

Dr Michael Suppa

Basically, what we are doing here is that we not only have the sensor, where you end up with an image or a point cloud, but in many cases our customers ask for this piece of software that you’re talking about, which gives you the position and the classification of the part. And on the sensors we are making, I have one here, that’s the rc_visard, that’s our brain, and there’s an Nvidia system in here, so the brain is actually on the camera. So the decision about what kind of part it is, how it’s detected and so on, that’s actually done on the camera. It’s a smart camera: we have the stereo vision and the computation on board, and then the customer gets a grasp point or an object location or something like this. That’s all in the camera, which helps a lot to distribute computational power. So basically our camera is a smart camera with the vision software on it. The only thing you need to do is teach the grasp; that’s something the operator usually does, because that’s process knowledge. Where to pick the part is something that you teach in, and the rest is done automatically on the camera.
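
Conceptually, a smart camera like this computes detections on board and the robot controller just consumes the resulting grasp poses over the network. The client below is purely hypothetical: the endpoint name, parameters and JSON layout are invented for illustration and do not reflect Roboception’s actual interface.

```python
import json
import urllib.request

def fetch_grasps(host: str, template_id: str):
    """Ask a (hypothetical) smart 3D camera for grasp poses of a taught object template."""
    url = f"http://{host}/api/detect"  # invented endpoint, for illustration only
    payload = json.dumps({"template": template_id}).encode()
    req = urllib.request.Request(url, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=5.0) as resp:
        result = json.load(resp)
    # Assumed (invented) layout: a list of grasps, each with a pose in the robot frame.
    return result.get("grasps", [])

# Example usage (commented out; requires a camera on the network):
# for grasp in fetch_grasps("192.168.0.10", "housing_v1"):
#     print(grasp["pose"])  # hand this pose to the robot controller
```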

Philip English

On the camera, right. Fantastic. I mean, that’s probably a good segue into the line of products. Is it one core camera you’ve got, or are there a few in the product line?

Dr Michael Suppa

So we have different versions of the baseline. With a stereo sensor, the baseline determines the range that you can measure. One version has a 65 millimetre baseline, which is more or less the same as the human eyes, so you have a good stereo range of a little bit over a metre, which is the same as a human: the reach of your arms is your stereo range, because that’s where you manipulate objects. This is the sensor that’s usually mounted on the robot when you do an application with a robot-guided camera. Then we have a second one with a 160 millimetre baseline, which is for external mounting, when you look from above and have the robot going in and out. So we have these two baselines, which cover a large variety of cases. And then we have another system with a higher resolution and a larger baseline for very tiny parts at a greater distance. So that’s basically the three hardware settings that we have. In robotics, I think the two baselines cover like 90% of the cases; sometimes there’s a specific setting that the customer likes and then it’s a special camera. But we are aiming for these 80% to 90% of cases where the hardware stays the same and you change a little bit on the software, because from a business perspective it’s always easier to do it like that than to change the hardware every time.
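
The reason the baseline determines the usable range is that depth resolution degrades roughly with the square of the distance and improves linearly with baseline (and focal length). A quick sketch, with illustrative numbers only, comparing a 65 mm and a 160 mm baseline:

```python
def depth_resolution_m(z_m, focal_px, baseline_m, disparity_step_px=0.1):
    """Approximate depth uncertainty: dZ ~ Z^2 / (f * B) * disparity step."""
    return (z_m ** 2) / (focal_px * baseline_m) * disparity_step_px

focal_px = 1000.0  # illustrative focal length in pixels
for baseline in (0.065, 0.160):
    for z in (0.5, 1.0, 2.0):
        dz = depth_resolution_m(z, focal_px, baseline)
        print(f"baseline {baseline*1000:.0f} mm, range {z:.1f} m -> ~{dz*1000:.1f} mm depth step")
```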

Philip English

Yeah, that’s right. And as you said, it makes sense to hit that 80 to 90% range to cover most of the tasks you’re likely to find. I was with a company up in Scotland a few weeks ago, and actually there’s another company doing it as well, but they deal with waste, and they’ve been looking at a robot that can go into a waste disposal unit, pick an object, a bit of wood or a can, and put it into the right pot. They said the biggest issue is vision, because how can you tell if it’s a bit of wood or a painted piece of plastic? So that’s something that would fit more into the other 20%, because it’s quite niche and specialised. How long do you think it would take for us to get to the level where we can mount a camera and determine bits of rubbish as well as we can? Have you seen that tech?

Dr Michael Suppa

So I think that’s more or less a classification problem, which usually comes before the pose detection. What we are doing most of the time is the pose: what kind of object we have and where it is, so to say. And I think with a lot of the machine learning and so on, we are now able to classify more and more parts, because what machine learning does is basically classification. You have a lot of parameters you can train, and that’s something you don’t want to do manually for such a large number of objects; that’s very hard to do. With machine learning and databases, that’s the way to go there, and I think we are already quite far just with the classification of the parts. So basically you have some data, you use it in a data-driven approach and you classify. I think that’s the way to go here, and it depends a little bit on how big the data sets are that we have. That said, robotics is still small data, as opposed to autonomous driving and so on, which have very large data sets, and that’s a bit of the problem of robotics: we still deal with small amounts of data, so we have to focus on the right data. It’s not like we can say, hey, we are taking two years to record data. That’s actually what autonomous driving did; the first data sets are ten years old now and they’re still working on them. For robotic applications we don’t have the time. So I think we have to focus now on what the right data is for a specific case, train on that and work with that. I think that’s really key in order to be successful.
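
In practice the “small data” constraint usually points to transfer learning: start from a network pre-trained on a large generic dataset and fine-tune only the last layer on the relatively few part images you can realistically collect. A minimal sketch using torchvision, assuming a hypothetical ImageFolder-style dataset of part classes; this is not the interviewee’s pipeline.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Small, task-specific dataset of part images, organised one folder per class.
tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
train_set = datasets.ImageFolder("parts/train", transform=tfm)  # hypothetical path
loader = torch.utils.data.DataLoader(train_set, batch_size=16, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():           # freeze the pre-trained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))  # train new head only

opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for epoch in range(5):                 # a few epochs are often enough on small data
    for images, labels in loader:
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()
```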

Philip English

As you said, you want to hit that percentage of jobs that are happening now and that need that technology. That’s really good. What’s the future that you see? Are you looking to bring out some new products? I suppose the technology evolves and the actual product evolves as well. What’s a rough road map for you guys?

Dr Michael Suppa

Yeah. So we have really good experience by now with using models to generate synthetic data to train our neural networks for known parts. That has helped a lot to get ground truth data and to really reduce the on-site recording time to a minimum. That’s an area where we have model data, and we want to expand it also to more unknown parts, as I mentioned in the beginning, and maybe also transparent parts, to reduce that to a good data set that we can use with ground truth and so on. That’s one thing we are aiming for. The second is, on the hardware side, to expand the portfolio a little bit, so maybe also integrate different sensing principles like time of flight and so on, to cover a wider scope of application areas. And then to transfer the products from mostly industrial automation and logistics now also to lab automation and more service robotics, because we are moving from a very, let’s say, structured way in which the robot operates to a less structured way. I think that’s one of the goals we want to go for: to enable 3D perception for less constrained environments, so more unknown cases. And this relates to vision skills, you may call it, which we want to implement in the future.
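
Using CAD models to generate synthetic training data essentially means rendering the part in many randomised poses and keeping the pose you rendered it at as perfect ground truth. The outline below is a schematic sketch; `render_part` stands in for whatever renderer would be used and is not a real API, and the file names are invented.

```python
import json
import math
import random

def random_pose():
    """Sample a random 6-DoF pose: a position inside a bin-sized volume plus a random rotation."""
    position = [random.uniform(-0.2, 0.2), random.uniform(-0.3, 0.3), random.uniform(0.4, 0.8)]
    rotation = [random.uniform(-math.pi, math.pi) for _ in range(3)]  # roll, pitch, yaw
    return {"position": position, "rotation_rpy": rotation}

def generate_dataset(n_samples, mesh_path, out_file):
    labels = []
    for i in range(n_samples):
        pose = random_pose()
        # render_part(mesh_path, pose, f"img_{i:05d}.png")  # hypothetical renderer call
        labels.append({"image": f"img_{i:05d}.png", "pose": pose})  # pose is exact ground truth
    with open(out_file, "w") as f:
        json.dump(labels, f, indent=2)

generate_dataset(1000, "housing.stl", "synthetic_labels.json")
```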

Philip English

Yeah, well, this is it. I mean, I suppose the next big question I have is that this could potentially link to artificial intelligence, and I know there are many dates thrown around for when we’re going to get to that general intelligence. I’ve heard dates of 2030, I’ve heard dates of 2050, I’ve heard dates of never. Have you played with any machine learning or artificial intelligence, and have you got any predictions for when you can actually plug in some sort of intelligence that will recognise the object straight away?

Dr Michael Suppa

So at the moment we are mostly using the machine learning part of things, basically for two reasons. One is ease of use, because of the parameters: usually, if you want to configure a vision system, you have a lot of parameters. If you do this by machine learning, that’s done by the neural network, so you have like a one-click solution, and that’s quite helpful on that end. So we are using machine learning as a tool to give our customers ease of use and easier access to a complex technology. That’s also what I see as a trend, because machine learning and AI are tools, right? They must serve a purpose and must make things easier for the application side, otherwise they don’t bring any value. That’s what we are seeing at the moment. The second thing I see as a big issue, which relates more to artificial intelligence, is this kind of supervision system. When you have a robot in production, there’s some kind of error handling that always needs to take place, or an overview. At the moment, most machines are then stopped, someone goes in, fixes the error and so on. What we want for the future is a supervision where the system notices what’s going on and adapts the right strategy to solve the problem. That’s still something I would see as a combination of machine learning and artificial intelligence, to have less stopping of machines. And the same applies to, let’s say, service robots, where for sure decision making is needed for detection, but also for determining the next action to take. Because in industrial automation the actions are more or less fixed, but service robotics has a large scope of potential next actions, and the semantics, basically the background of what needs to be done, decide what’s happening next. I think that’s the biggest challenge now: to figure out what’s the next action to take based on what you see and how you interpret the environment, and for sure artificial intelligence, and in the beginning machine learning, will play an important role in that space.
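
A very simplified version of the supervision idea is a loop that watches the outcome of each pick and chooses a recovery strategy instead of halting the cell. The outcomes and strategies below are invented placeholders to illustrate the control flow, not an existing product feature.

```python
import random

def attempt_pick():
    """Stand-in for the real pick cycle; returns an outcome label."""
    return random.choice(["success", "no_detection", "grasp_failed"])

RECOVERY = {
    "no_detection": "re-image the bin from a different viewpoint",
    "grasp_failed": "retry with the next-best grasp candidate",
}

def supervised_cycle(max_retries=3):
    for attempt in range(1, max_retries + 1):
        outcome = attempt_pick()
        if outcome == "success":
            return f"picked on attempt {attempt}"
        print(f"attempt {attempt}: {outcome} -> {RECOVERY[outcome]}")
    return "escalate to operator"  # only stop the machine as a last resort

print(supervised_cycle())
```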

Philip English

Yes. Have you ever had any food applications? You were speaking about service robots there, and I know that the big food companies are looking at robots, robots that can make food and all that type of technology. It seems to be growing right at the moment; no one has cracked it yet, and I’ve seen some robot arms making food. Could it be used for a food type of application?

Dr Michael Suppa

Yeah. So actually I was in touch with some of these companies that are doing this, let’s say, food cooking or food preparation robots. So far, I guess most of them are doing it in a pretty classical way: you have a recipe and you just bring parts together and stir it around. But I think that depends a little bit on the dish; most of them are doing bowls, right? That’s why it’s a bit simpler to start with. But as soon as you go to more complex food that requires a bit of, like, turning meat or something like that, where you have more detection tasks, where you have to do it in a certain way and after a certain time, or at a certain level you need to see if the meat is done or not, something like that, then I would say that vision will play an important role. At the moment they’re going for the bowls because, from the vision side, that’s less complex. And burgers are also quite simple, because they have a very clear pattern for how they should be done. But as the variety gets bigger, towards more complex dishes, then 3D vision and machine learning will surely play an important role, not only in the cooking process but also in quality control, to make sure that the food goes out in the right way. Because customers are not happy when they order medium rare and get well done.

Philip English

That’s right, yeah. Well, like you said, and as I saw on the website, it’s the eyes and brain of the robot, and that’s it. Especially as the service robot industry evolves, it’s going to be key, really. I suppose the only last question I had, just conscious of time, is: can you use them for any form of safety? Obviously I’m thinking about laser scanners here: you break the laser scanner field, the machine switches off. Is there a similar sort of thing that you can use for a safety purpose?

Dr Michael Suppa

Yeah, so on the stereo side, safety is a bit more difficult because it’s a passive system. A laser is active: you send a signal out, you get it back, and you can do this differentiation, you have two channels for it and so on. On the stereo side, we have something that we call confidence, so basically we have an estimation of how accurately and how well our algorithm works. That’s a basis for getting to safety, but from the Roboception point of view we have not started that, because it’s quite a long-term qualification process for a company like ours. It’s something we could do with partners. In principle it’s possible; we have just not done it, because in many of the applications we have seen so far it was not needed. It’s becoming more relevant now, with mobile robots being more and more out in the world and needing to see more than this ten-centimetre-above-ground laser line, so it becomes much more relevant and the demand is there. And I think there is a bit of a contest between time of flight and stereo. Time of flight has a bit of an advantage because it’s active as well, so you can use a similar principle to what you use with a laser. But it’s still to be decided which one will be the, let’s say, 3D safety sensor with the larger field of use. And it will not necessarily be a laser; it could also be radar, or a combination of radar and camera, something like this. So this is an active process at the moment; it’s not decided yet what it’s going to be.
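
The confidence idea can be pictured as a per-pixel reliability map delivered alongside the depth image; downstream logic only trusts depth values whose confidence clears a threshold. A toy sketch with synthetic arrays (not the actual sensor output format):

```python
import numpy as np

# Synthetic stand-ins for a depth image and its per-pixel confidence map.
depth = np.random.uniform(0.4, 1.2, size=(480, 640))       # metres
confidence = np.random.uniform(0.0, 1.0, size=(480, 640))  # 0 = unreliable, 1 = reliable

CONF_THRESHOLD = 0.8
valid = confidence >= CONF_THRESHOLD
safe_depth = np.where(valid, depth, np.nan)  # mask out pixels we should not act on

coverage = valid.mean() * 100
print(f"{coverage:.1f}% of pixels pass the confidence threshold")
# A monitoring layer could slow or stop the robot if coverage drops
# below a safe level inside the protective field.
```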

Philip English

But this is it. It’s interesting, because at the moment a robot may have multiple systems on it, but as you’re saying, as the technology comes along, if you can use the same piece of kit for various different types of sensing and applications on the robot, then that’s perfect. I’m thinking of, I know Amazon bought a company, it must have been about a year ago now, which was a mobile platform company, but they just had a camera: the whole platform works on a camera. I’ll have to dig it up; I’ll put it in the notes. That would be interesting to see, because they must have some sort of safety within that camera as well. So, yeah, I’ll do a bit of digging.

Dr Michael Suppa

They all work with bumpers on top of that, so basically you have a physical tactile sensor around the platform or something like this. It could also be the same as it is with vacuum cleaners and so on: they also have a camera, but not for safety purposes, for navigation. But it’s interesting to see. I mean, for autonomous driving, if you listen to Elon Musk, he says that the camera is the sensor, but I would say, for the beginning, it will always be a combination of a camera plus X: camera plus laser, camera plus radar and so on, because data fusion is actually the best safety principle in the world, since it combines at least two modalities. That is one of the ways you could also do it: combining two cheap sensors as opposed to an expensive laser or lidar system or something like that. Yeah.

Philip English

Okay. Well, that’s been great. Yeah, that’s a great overview. Thanks, Michael. I think the last step is really to see if people want to get in contact with you; I’m guessing they should reach out to the Roboception website and give the guys a call if there’s any interest there. Right. Thanks, Michael. That’s been a great interview. And yeah, as I was saying, if you want to get in contact, then we know where to go. Thank you very much for your time; very much appreciated.

Dr Michael Suppa

Thanks a lot, and also thanks for the great interview, Philip. I really enjoyed it, and I’m looking forward to getting a lot of requests and questions from the community. And yeah, I hope I generated some interest in 3D vision, machine learning and, let’s say, the eyes and brains for the robot.

Philip English

Thanks, Michael. Thank you.

Dr Michael Suppa

You’re welcome.
