Clickbaits are routinely used by online publishers to attract people's attention in competitive media markets. Clickbaits are increasingly used in visual-centric social media but remain a largely unexplored problem. Existing defense mechanisms rely on text-based features and are thus inapplicable to visual social media. By exploring the relationships between images and text, we develop a novel approach to characterize clickbaits on visual social media. Focusing on the topic of fashion, we first examined the prevalence of clickbaits on Instagram and surveyed their negative impacts on user experience through a focus group study (N=31). In a large-scale analysis, we collected 450,000 Instagram posts and manually labeled 12,659 of these posts to determine what people consider to be clickbaits. By combining three different types of features (i.e., image, text, and meta features), our classifier was able to detect clickbaits with an accuracy of 0.863. We performed an extensive feature analysis and showed that content-based features are much more important than meta features (e.g., number of followers) in clickbait classification. Our analysis indicates that approximately 11% of fashion-related Instagram posts are clickbait and that these posts are consistently accompanied by many hashtags, thus demonstrating that clickbait is prevalent in visual social media.