If you can learn a foreign language, why can't AI? Translation does not have to rely on hard-coded grammar structure, since that can be learned. The attention model was designed specifically to handle these dependencies in translation, and it then led to the development of AI for other tasks. You will be surprised.
For old translation systems, you are absolutely right though.
Because Japanese is a notoriously context-dependent language, AI cannot accurately translate even simple sentences like "Daijoubu desu". Its meaning varies wildly depending on the context, such as "I'm fine", "He was okay", "We'll make it", "You can count on this", "No thank you", etc.
And context isn't always clear. It depends on where the conversation is taking place, where the speaker is looking, and where speech bubbles are positioned. If the author intends to mislead readers, the information necessary to understand the context might only be revealed in future episodes.