It's well known that when LLMs 'think through' something before speaking, it improves the quality of responses.
To facilitate this if you could let us specify to speak only the text within specified XML tags, then we could instruct the LLM to output e.g.:
<thinking>
step 1
step 2
</thinking>
<response>
This is my response
</response>
And speak only:
This is my response