Personal health literacy is an important indicator of a nation's health status. Providing citizens with sufficient medical knowledge helps them understand their own health conditions. To support this goal, an integrated system was developed for evaluating the readability of healthcare documents, using heart disease as the target topic. The mechanism can be extended to other diseases and languages by changing the corresponding word databank. The system assesses document readability from the patient's perspective rather than the professional's. Commonly used terms and professional medical terms extracted from a query document served as the fundamental elements for readability analysis, and the derived features included the term frequency of professional medical terms, the proportion of professional medical terms, and a diversity indicator of medical terms. Five-fold cross-validation was applied to measure the robustness of the proposed approach. In the experiments, the approach achieved a recall of 0.93, a precision of 0.97, and an accuracy of 0.95.
International Journal of Data Warehousing and Mining 16(1), p.63-72
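
The three features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `readability_features`, the single-word lexicon, the regular-expression tokenizer, and in particular the definition of the diversity indicator (distinct medical terms divided by total medical-term occurrences) are all assumptions made for illustration, since the abstract does not define them.

```python
import re
from collections import Counter


def readability_features(text, medical_terms):
    """Compute three illustrative term-based features for one document.

    `medical_terms` is a hypothetical lexicon: a set of lowercase,
    single-word professional medical terms.
    """
    # Assumption: simple lowercase alphabetic tokenization.
    tokens = re.findall(r"[a-z]+", text.lower())

    # Count occurrences of each professional medical term.
    counts = Counter(t for t in tokens if t in medical_terms)
    medical_total = sum(counts.values())

    # Share of all tokens that are professional medical terms.
    proportion = medical_total / len(tokens) if tokens else 0.0

    # Assumption: diversity = distinct medical terms / medical-term occurrences.
    diversity = len(counts) / medical_total if medical_total else 0.0

    return {
        "medical_term_frequency": medical_total,
        "medical_term_proportion": proportion,
        "medical_term_diversity": diversity,
    }


if __name__ == "__main__":
    lexicon = {"myocardial", "infarction"}  # hypothetical word databank
    doc = "Myocardial infarction is a heart attack. Infarction damages the heart."
    print(readability_features(doc, lexicon))
```

Under these assumptions, swapping in a different lexicon is the only change needed to target another disease, which mirrors the extension mechanism the abstract describes.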