
Are the state-action values and the state value function equivalent for a given policy?


Are the state-action values and the state value function equivalent for a given policy? I would assume so, as the value function is defined as $V_{\pi}(s)=\sum_a \pi(a|s)Q_{\pi}(s,a)$. If we are following a greedy policy and hence acting optimally, doesn't this mean that the policy is in fact deterministic, so that $\pi(a|s)$ is $1$ for the optimal action and $0$ for all others? Would this then lead to an equivalence between the two?
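
To make this concrete, here is a minimal sketch in Python (the Q-values and policy probabilities are made-up numbers, purely for illustration) showing that the sum $\sum_a \pi(a|s)Q_{\pi}(s,a)$ collapses to a single Q-value when the policy is deterministic and greedy:

```python
# Minimal sketch: V(s) as the policy-weighted average of Q(s, a).
# The Q-values and policy probabilities below are arbitrary, illustrative numbers.

q = {"left": 1.0, "right": 3.0, "stay": 2.0}  # hypothetical Q_pi(s, a) for one fixed state s

def state_value(policy, q_values):
    """V(s) = sum_a pi(a|s) * Q_pi(s, a)."""
    return sum(policy[a] * q_values[a] for a in q_values)

stochastic_pi = {"left": 0.2, "right": 0.5, "stay": 0.3}
greedy_action = max(q, key=q.get)
greedy_pi = {a: (1.0 if a == greedy_action else 0.0) for a in q}

print(state_value(stochastic_pi, q))  # 2.3 -- a mixture of the Q-values
print(state_value(greedy_pi, q))      # 3.0 -- exactly max_a Q(s, a) = Q(s, pi(s))
```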

Here is my attempt to formulate some form of proof. I start from the idea that a policy $\pi^*$ is defined to be at least as good as the current policy $\pi$ if, for all states $s$, $Q_{\pi}(s,\pi^*(s))\geq V_{\pi}(s)$:

I iteratively apply the improved policy at each successive time step, unrolling the expectation until only the rewards obtained under $\pi^*$ remain:

$$V_{\pi}(s)\leq Q_{\pi}(s,\pi^*(s))$$
$$=\mathbb{E}_{\pi^*}[R_{t+1}+\gamma V_{\pi}(S_{t+1})\mid S_t=s]$$
$$\leq \mathbb{E}_{\pi^*}[R_{t+1}+\gamma Q_{\pi}(S_{t+1},\pi^*(S_{t+1}))\mid S_t=s]$$
$$\leq \mathbb{E}_{\pi^*}[R_{t+1}+\gamma R_{t+2}+\gamma^2 Q_{\pi}(S_{t+2},\pi^*(S_{t+2}))\mid S_t=s]$$
$$\leq \mathbb{E}_{\pi^*}[R_{t+1}+\gamma R_{t+2}+\gamma^2 R_{t+3}+\cdots\mid S_t=s]$$
$$=V_{\pi^*}(s)$$
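
To sanity-check this chain numerically, here is a small sketch on a made-up two-state, two-action MDP (the transitions, rewards, and $\gamma=0.9$ are arbitrary assumptions, not from the question). It evaluates a uniform random policy $\pi$, builds the greedy policy $\pi^*$ from $Q_{\pi}$, and verifies $V_{\pi}(s)\leq Q_{\pi}(s,\pi^*(s))\leq V_{\pi^*}(s)$ in every state:

```python
# Sketch: policy evaluation plus one greedy improvement step on a tiny,
# made-up deterministic MDP, checking V_pi(s) <= Q_pi(s, pi*(s)) <= V_pi*(s).

GAMMA = 0.9
# transition[(state, action)] = (next_state, reward) -- arbitrary illustrative dynamics
transition = {
    ("s0", "a0"): ("s0", 0.0),
    ("s0", "a1"): ("s1", 1.0),
    ("s1", "a0"): ("s0", 2.0),
    ("s1", "a1"): ("s1", 0.0),
}
states = ["s0", "s1"]
actions = ["a0", "a1"]

def evaluate(policy, sweeps=500):
    """Iterative policy evaluation: V(s) = sum_a pi(a|s) * (r + gamma * V(s'))."""
    v = {s: 0.0 for s in states}
    for _ in range(sweeps):
        new_v = {}
        for s in states:
            total = 0.0
            for a in actions:
                s_next, r = transition[(s, a)]
                total += policy[s][a] * (r + GAMMA * v[s_next])
            new_v[s] = total
        v = new_v
    return v

def q_from_v(v):
    """Q(s, a) = r + gamma * V(s') for the deterministic dynamics above."""
    return {(s, a): transition[(s, a)][1] + GAMMA * v[transition[(s, a)][0]]
            for s in states for a in actions}

uniform = {s: {a: 0.5 for a in actions} for s in states}
v_pi = evaluate(uniform)
q_pi = q_from_v(v_pi)

# Greedy improvement: pi*(s) = argmax_a Q_pi(s, a), expressed as a deterministic policy.
greedy = {s: {a: (1.0 if a == max(actions, key=lambda b: q_pi[(s, b)]) else 0.0)
              for a in actions}
          for s in states}
v_star = evaluate(greedy)

for s in states:
    a_star = max(actions, key=lambda b: q_pi[(s, b)])
    assert v_pi[s] <= q_pi[(s, a_star)] <= v_star[s] + 1e-9
    # e.g. s0: 7.25 <= 7.975 <= 14.737, s1: 7.75 <= 8.525 <= 15.263
    print(s, round(v_pi[s], 3), round(q_pi[(s, a_star)], 3), round(v_star[s], 3))
```

Since $\pi^*$ here is deterministic, $V_{\pi^*}(s)=Q_{\pi^*}(s,\pi^*(s))$ in every state, which is exactly the collapse of the sum into a single Q-value discussed above.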

I would say that the final two lines in fact hold with equality, and for me this makes intuitive sense: if we are always taking a deterministic greedy action, our value function and Q function are the same. As detailed here, for a given policy and state we have $V_{\pi}(s)=\sum_a \pi(a|s)Q_{\pi}(s,a)$, and if the policy is optimal and hence greedy, then $\pi(a|s)$ is $1$ for the greedy action and $0$ for all others, so the sum collapses to a single Q-value.

