Gamespeak really is as simple as sending a signal to every entity within a certain radius of the player and having each one respond with a set behavior.
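To make the radius idea concrete, here's a rough sketch in plain Python rather than Unity code. Everything in it (the Entity class, on_gamespeak, broadcast) is a name I made up for illustration, not anything from an actual engine. In Unity I'd guess the distance check would be replaced by something like Physics.OverlapSphere, but I haven't tried that here.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical game entity with a 3D position and a log of commands heard."""
    name: str
    x: float
    y: float
    z: float
    heard: list = field(default_factory=list)

    def on_gamespeak(self, command: str) -> None:
        # Placeholder response: a real game would switch behavior state here.
        self.heard.append(command)


def broadcast(command, origin, radius, entities):
    """Send command to every entity within radius of origin; return those reached."""
    reached = []
    for entity in entities:
        # Straight-line distance in 3D between the speaker and the entity.
        if math.dist(origin, (entity.x, entity.y, entity.z)) <= radius:
            entity.on_gamespeak(command)
            reached.append(entity)
    return reached


# Usage: one entity in earshot, one well out of range.
mudokons = [Entity("near", 1, 0, 0), Entity("far", 50, 0, 0)]
reached = broadcast("follow me", (0, 0, 0), 10, mudokons)
```

This only covers getting the signal to the right entities. What each one does once it hears the command is a separate problem.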
But how does that behavior work? Coding a slig patrol might seem simple enough, but making the slig respond intelligently to anything dynamic would require some good 3D pathfinding, which I think is probably a bit beyond an AI novice such as myself. Having characters navigate that extra dimension adds a whole new layer of complexity. Good, dynamic 2D AI is hard enough.
I could see what tools Unity gives me to make this easier, and I'd certainly give it a go, but it's a big ask for me.