Controller seemed fairly straightforward to me when I was first learning Smalltalk (ParcPlace), and I took my simple understanding on faith.
My programs were simple, so M was data, V was presentations of the data, C was interaction on the M and maybe V.
M is the Model. That means the data and all the things you might ever want to do with the data. So any interaction you might want to do from the view is (ideally) a single message-send to the model.
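To make that concrete, here is a rough sketch of what I mean, written in Python rather than Smalltalk so it is easy to run. Everything in it is made up for illustration: `CounterModel`, `CounterView`, `on_plus_clicked`, and the small observer list are my assumptions, not anything from ParcPlace's class library. The point is only that the model owns both the data and the operations, so the view's handler collapses to one message-send.

```python
class CounterModel:
    """Hypothetical model: the data plus every operation on that data."""

    def __init__(self):
        self._count = 0
        self._observers = []  # assumed change-notification mechanism

    def add_observer(self, callback):
        # Views register here so they are told when the data changes.
        self._observers.append(callback)

    @property
    def count(self):
        return self._count

    def increment(self):
        # The whole operation lives in the model, not in the view.
        self._count += 1
        self._notify()

    def reset(self):
        self._count = 0
        self._notify()

    def _notify(self):
        for callback in self._observers:
            callback(self)


class CounterView:
    """Hypothetical view: presents the data and forwards edits."""

    def __init__(self, model):
        self.model = model
        self.rendered = []  # stands in for drawing on a screen
        model.add_observer(self.render)

    def render(self, model):
        self.rendered.append(f"count = {model.count}")

    def on_plus_clicked(self):
        # Ideally the entire interaction is one message-send to the model.
        self.model.increment()
```

If the handler in the view starts needing several lines of logic, that is usually a hint that the operation belongs in the model instead.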
> V was presentations of the data
And editing the data.
> C was interaction on the M and maybe V.
> It only got confusing when I got more experience.