Messaging is a common mode of communication, with conversations written informally between individuals. Interpreting emotional affect from messaging data can enable a powerful form of reflection or support clinical therapy. Existing analysis techniques for social media commonly use LIWC and VADER for automated sentiment estimation. We correlate LIWC scores, VADER scores, and ratings from human reviewers with self-reported affect (PANAS) scores from 25 participants, and we explore differences in how and when each technique succeeds. Results show that human review outperforms VADER, the best automated technique, when humans judge positive affect ($r_s=0.45$ correlation when confident, $r_s=0.30$ overall). Surprisingly, human reviewers do only slightly better than VADER when judging negative affect ($r_s=0.38$ correlation when confident, $r_s=0.29$ overall). Compared to prior literature, VADER correlates more closely with PANAS scores for private messaging than for public social media. Our results indicate that while all of these techniques are at best moderately correlated proxies for PANAS scores, automated techniques could be improved by better accounting for context and timing in conversations.
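To make the reported analysis concrete, the sketch below shows one plausible way to compute the kind of $r_s$ values cited above: score each participant's messages with VADER, aggregate the compound scores per participant, and compute Spearman's rank correlation against PANAS affect scores. This is a minimal illustration under assumed inputs, not the paper's actual pipeline; the `messages` and `panas_positive` variables are hypothetical placeholders.

```python
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from scipy.stats import spearmanr

analyzer = SentimentIntensityAnalyzer()

# Hypothetical inputs: one message list and one PANAS positive-affect
# total per participant, in the same order (illustrative values only).
messages = [
    ["had a great day!", "see you soon :)"],     # participant 1
    ["ugh, everything went wrong", "so tired"],  # participant 2
    ["meeting moved to 3pm", "ok sounds good"],  # participant 3
]
panas_positive = [34, 18, 25]

# Aggregate VADER's compound score (range -1 to 1) as a mean per participant.
vader_scores = [
    sum(analyzer.polarity_scores(m)["compound"] for m in msgs) / len(msgs)
    for msgs in messages
]

# Spearman rank correlation, the r_s statistic reported in the abstract.
r_s, p_value = spearmanr(vader_scores, panas_positive)
print(f"r_s = {r_s:.2f}, p = {p_value:.3f}")
```

Spearman's $r_s$ is a natural choice here because it compares rankings rather than raw magnitudes, so it does not assume a linear relationship between VADER's compound score and PANAS totals.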