This paper considers the problem of estimating speaker location in a smart conference video system. A four-microphone cross array is constructed to localize speakers from Time Difference of Arrival (TDoA) measurements based on a hyperbola model. The time delays are calculated using the Generalized Cross Correlation (GCC) algorithm. A practical test system was built to confirm the feasibility of the hyperbola model and the GCC algorithm. Data were collected in field experiments and processed on a PC using MATLAB. The results show that the method presented in this paper is feasible for localizing the speaker.
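
As a rough illustration of the time-delay step, the following minimal sketch estimates the delay between two microphone signals by locating the peak of their cross-correlation, then converts it to the range difference that defines one hyperbola. It assumes unit weighting (plain cross-correlation, the simplest GCC variant); the weighting actually used in the paper is not stated here. The function names, the sampling rate of 16 kHz, and the speed of sound of 343 m/s are illustrative assumptions, and the sketch is written in Python rather than MATLAB.

```python
def estimate_delay_samples(x, y, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation
    of x and y. A positive lag means y is a delayed copy of x.
    Assumption: unit weighting, i.e. plain time-domain cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(x)):
            j = i + lag
            if 0 <= j < len(y):
                score += x[i] * y[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def range_difference(lag_samples, fs_hz, speed_of_sound=343.0):
    """Convert a sample lag into a path-length difference in meters.
    The source lies on the hyperbola whose foci are the two microphones
    and whose distance difference equals this value."""
    return speed_of_sound * lag_samples / fs_hz


def make_signal(length, onset, burst):
    """Hypothetical helper: a zero signal with a short burst at `onset`."""
    sig = [0.0] * length
    for k, v in enumerate(burst):
        sig[onset + k] = v
    return sig


# Synthetic example: the same burst reaches microphone 2 three samples later.
burst = [1.0, -2.0, 3.0, -1.0]
mic1 = make_signal(64, 10, burst)
mic2 = make_signal(64, 13, burst)
lag = estimate_delay_samples(mic1, mic2, max_lag=8)
delta_d = range_difference(lag, fs_hz=16000)
```

With a four-microphone array, several such pairwise range differences are available, and the speaker position can be taken as the intersection of the corresponding hyperbolas.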