We consider the problem of learning linear classifiers when both the features and the labels are binary. In addition, the features are noisy, i.e., each may be flipped with an unknown probability. In the Sy-De attribute noise model, where all features of an example are flipped together with the same probability, we show that the 0-1 loss ($l_{0-1}$) need not be robust, whereas a popular surrogate, the squared loss ($l_{sq}$), is. In the Asy-In attribute noise model, we prove that $l_{0-1}$ is robust for any distribution over a 2-dimensional feature space. However, because minimizing $l_{0-1}$ is computationally intractable, we resort to $l_{sq}$ and observe that it need not be robust to Asy-In noise. Our empirical results support the Sy-De robustness of the squared loss for low to moderate noise rates.
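To make the two noise models concrete, the following is a minimal sketch, not the authors' implementation. It assumes features and labels are encoded as ±1, reads Sy-De as a single flip decision per example that negates all of its features, and reads Asy-In as independent per-feature flips with feature-specific probabilities. The function names (`sy_de_flip`, `asy_in_flip`, `fit_squared_loss`, `demo`) are hypothetical, and the squared-loss minimizer is fitted without a bias term.

```python
import numpy as np

def sy_de_flip(X, p, rng):
    """Assumed Sy-De noise: one coin per example; on success all its features flip together."""
    flip_row = rng.random(X.shape[0]) < p
    return np.where(flip_row[:, None], -X, X)

def asy_in_flip(X, p_per_feature, rng):
    """Assumed Asy-In noise: feature j flips independently with its own probability p_j."""
    flip = rng.random(X.shape) < np.asarray(p_per_feature)
    return np.where(flip, -X, X)

def fit_squared_loss(X, y):
    """Minimize the empirical squared loss mean((y - X @ w)**2) over w (no bias term)."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def cosine(u, v):
    """Cosine similarity between two weight vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def demo(n=20000, p=0.2, seed=0):
    """Compare squared-loss minimizers on clean and Sy-De-noisy features (illustrative setup)."""
    rng = np.random.default_rng(seed)
    X = rng.choice([-1.0, 1.0], size=(n, 3))
    # Labels from a hypothetical true classifier; a sum of three +/-1 terms is never zero.
    y = np.sign(X @ np.array([1.0, 1.0, 1.0]))
    w_clean = fit_squared_loss(X, y)
    w_noisy = fit_squared_loss(sy_de_flip(X, p, rng), y)
    return cosine(w_clean, w_noisy)
```

Under these assumptions, flipping a whole example leaves the outer product of its feature vector unchanged and only shrinks the feature-label correlation, so for a flip probability below 0.5 the noisy minimizer is expected to point in nearly the same direction as the clean one. Calling `demo()` returns the cosine similarity between the two weight vectors, and replacing `sy_de_flip` with `asy_in_flip` gives a way to probe the Asy-In case.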