It doesn't "know" that the capital of Oregon is Salem. To take an extreme example, if everyone on the internet agreed on the lie that the capital of Oregon is some other city, and we trained a model on that text, it would answer with the lie. When an LLM produces the words "the capital of Oregon is Salem", that does not imply it actually knows the fact. It's just that, in written language, Salem is statistically the word that most often follows as the capital of Oregon.
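A toy sketch can make the point concrete. It is not how a real LLM works internally (a real model is a neural network that predicts token probabilities, not a lookup counter), and every name in it is hypothetical: `train`, `complete`, the two tiny corpora, and "Portland" as an arbitrary stand-in for "another city". It only shows that a purely statistical completer repeats whatever its training text says most often.

```python
from collections import Counter

PROMPT = "the capital of Oregon is"


def train(corpus, prompt):
    """Count which word follows `prompt` in each sentence of the corpus."""
    counts = Counter()
    for sentence in corpus:
        if sentence.startswith(prompt):
            rest = sentence[len(prompt):].split()
            if rest:
                counts[rest[0]] += 1
    return counts


def complete(counts):
    """Return the most frequent continuation; truth plays no role here."""
    return counts.most_common(1)[0][0]


# Hypothetical corpora: one mostly truthful, one where the lie dominates.
truthful_corpus = [f"{PROMPT} Salem"] * 9 + [f"{PROMPT} Portland"]
lying_corpus = [f"{PROMPT} Portland"] * 9 + [f"{PROMPT} Salem"]

print(complete(train(truthful_corpus, PROMPT)))  # Salem
print(complete(train(lying_corpus, PROMPT)))     # Portland
```

The code is identical in both runs; only the training text changes, and the answer changes with it.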