Overview: installing pip

Mac: run `sudo easy_install pip` in a terminal, enter your Mac password, and wait a moment.

Command line:

```
sudo easy_install pip
Password:
Searching for pip
Reading https://pypi.python.org/simple/pip/...
```
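Once pip is available and the packages are installed, a quick check like the one below can confirm that the third-party modules used in the later sections are importable. This is only a sketch: the helper name `missing_modules` is made up for illustration, and the module list is taken from the import statements in this article.

```python
def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)  # accepts dotted names such as "mysql.connector"
        except ImportError:
            missing.append(name)
    return missing


# Modules imported by the examples in this article.
required = ["requests", "bs4", "mysql.connector", "splinter"]
print(missing_modules(required))  # an empty list means everything is installed
```

Any name that shows up in the printed list still needs to be installed with pip.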
## 1. Web scraping with requests and BeautifulSoup
```python
# -*- coding: UTF-8 -*-
import requests
from bs4 import BeautifulSoup

html = requests.get("http://www.jianshu.com/")  # fetch the target site
soup = BeautifulSoup(html.content, 'html.parser')  # parse the returned page source, naming the parser explicitly
for item in soup.select(".content"):  # find the tags with class="content"; returns a list of matching tags
    # print item.select(".avatar")[0]
    name = item.select(".blue-link")[0].text
    title = item.select(".title")[0].text
    content = item.select(".abstract")[0].text
    time = item.select(".time")[0]["data-shared-at"]
    print "Name:", name
    print "Title:", title
    print "Content:", content
    print "Time:", time
```

## 2. Database operations

Database driver: https://dev.mysql.com/downloads/connector/python/

```python
# -*- coding: UTF-8 -*-
import mysql.connector

config = {
    'user': 'root',
    'password': 'root',
    'host': '127.0.0.1',
    'database': 'test',
}

con = mysql.connector.connect(**config)
cursor = con.cursor()

# Insert
cursor.execute("insert into User values(null,%s,%s)", ['haha', '123'])
row = cursor.rowcount  # number of affected rows
print row  # 1
con.commit()  # autocommit is off by default, so commit to persist the insert

# Query
cursor = con.cursor()
cursor.execute("select * from User")
fetchall = cursor.fetchall()
print fetchall  # [(1, u'junwen', u'123'), (2, u'junwen', u'123'), (3, u'junwen', u'123')]

cursor.close()
con.close()
```
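To try the same insert and query pattern without a MySQL server, the sketch below swaps in Python's built-in `sqlite3` module and an in-memory database. It is a stand-in, not the MySQL driver: the table definition is an assumption (the article does not show the `User` schema), and sqlite3 uses `?` placeholders where MySQL Connector uses `%s`.

```python
import sqlite3

# In-memory database as a stand-in for the MySQL server; nothing is written to disk.
con = sqlite3.connect(":memory:")
cursor = con.cursor()

# Assumed schema: an auto-incrementing id plus two text columns.
cursor.execute(
    "create table User (id integer primary key autoincrement, name text, password text)"
)

# Insert: values are passed separately so the driver handles the quoting.
cursor.execute("insert into User values(null, ?, ?)", ["haha", "123"])
inserted = cursor.rowcount  # number of rows affected by the insert
con.commit()                # make the insert permanent

# Query
cursor.execute("select * from User")
rows = cursor.fetchall()
print(inserted)
print(rows)

cursor.close()
con.close()
```

Passing the values as a separate argument, as both versions do, keeps user-supplied text out of the SQL string itself.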
## 3. Splinter, a testing tool that can drive web pages automatically
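```
pip install splinter
pip install selenium
```

**Download chromedriver.exe and geckodriver.exe and add the location of each to the environment variables (PATH). The entry should be the containing folder, not the .exe file itself.**

chromedriver: http://download.csdn.net/download/qianaier/7966945 http://download.csdn.net/download/anan_ss/9723479

geckodriver: https://github.com/mozilla/geckodriver/releases/

After updating the environment variables, also copy chromedriver.exe into the directory that contains the .py file you want to run.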
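Before starting the browser, it can save time to confirm that the driver executables are actually discoverable through the PATH environment variable. The sketch below is one way to check; `find_on_path` is a hypothetical helper written for this article, not part of Splinter or Selenium.

```python
import os


def find_on_path(name, search_path=None):
    """Return the full path of `name` if a PATH folder contains it, else None."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    # On Windows the file on disk carries an .exe suffix, so try both spellings.
    candidates = [name, name + ".exe"]
    for folder in search_path.split(os.pathsep):
        for candidate in candidates:
            full = os.path.join(folder, candidate)
            if os.path.isfile(full) and os.access(full, os.X_OK):
                return full
    return None


for driver in ("chromedriver", "geckodriver"):
    print("%s: %s" % (driver, find_on_path(driver)))
```

A result of `None` for a driver suggests its folder has not been added to PATH yet, or that the terminal was opened before the variable was changed.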
```python
# coding=utf-8
from splinter.browser import Browser

xx = Browser(driver_name="chrome")
xx.visit("http://item.jd.com/2707976.html")
print xx.title  # page title: 京东...
print xx.driver_name  # browser name: chrome
print xx.url  # URL of the current page
xx.click_link_by_text("你好,请登录")  # click the link whose text matches this string
xx.click_link_by_text("账户登录")
xx.fill("loginname", "your_username")  # fill in a field, looked up by its name attribute
xx.fill("nloginpwd", "your_password")
xx.click_link_by_id("loginsubmit")
```
![](http://upload-images.jianshu.io/upload_images/2650372-c9ea3d4ed5533da7.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

## 4. Notes

**1. Encoding problems: if the file contains Chinese characters, add `# -*- coding: UTF-8 -*-` as the first line of the code.**

![](http://upload-images.jianshu.io/upload_images/2650372-af37fb6e6b5c712d.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

**2. 'module' object is not callable: cause analysis**

![](http://upload-images.jianshu.io/upload_images/2650372-008115d1065c3ddc.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

Cause and solution: Python has two ways to import a module, `import module` and `from module import ...`. With the former, everything you use from the module must be qualified with the module name; with the latter, no qualifier is needed. The error appears when the module itself is called as if it were the class it contains.

Correct code:

```python
import Person
person = Person.Person('dnawo', 'man')
print person.Name
```

or

```python
from Person import *
person = Person('dnawo', 'man')
print person.Name
```
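The same mistake is easy to reproduce with the standard library, because the `datetime` module contains a class that is also called `datetime`, just as the `Person` module above contains a `Person` class. A minimal sketch (the alias `DateTime` is only there to keep the module name usable in the same file):

```python
import datetime

# Wrong: `datetime` here is the module, and modules cannot be called.
try:
    datetime(2017, 1, 1)
    message = None
except TypeError as exc:
    message = str(exc)
print(message)

# Right, form 1: qualify the class with the module name.
first = datetime.datetime(2017, 1, 1)

# Right, form 2: import the class itself, so no qualifier is needed.
from datetime import datetime as DateTime
second = DateTime(2017, 1, 1)

print(first == second)
```

Both correct forms build the same object; they differ only in how the class name is reached.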
**3. WindowsError: [Error 183]: this happens because a folder with the same name already exists.**

![](http://upload-images.jianshu.io/upload_images/2650372-1ee1d3bb30a77ff7.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

**4. UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128)**

![](http://upload-images.jianshu.io/upload_images/2650372-a74e8a532df0bac8.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

Solution: add the following code.

```python
import sys
reload(sys)
sys.setdefaultencoding('utf8')
```
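An alternative that avoids changing the interpreter-wide default is to decode byte strings explicitly at the point where they enter the program. The sketch below reproduces the error with the UTF-8 bytes of the character 好, whose first byte is 0xe5, and then decodes them with the correct codec. The sample bytes are an assumption chosen for illustration; the article does not say which text triggered the original error.

```python
raw = b'\xe5\xa5\xbd'  # UTF-8 encoding of the single character U+597D

# Decoding as ASCII fails, because 0xe5 is outside the 0-127 range.
try:
    raw.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = str(exc)
print(error)

# Decoding with the codec the bytes were actually written in succeeds.
text = raw.decode('utf-8')
print(len(text))
```

Written this way, the fix stays local to the data that needs it and behaves the same on Python 2 and Python 3.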
## 5. Learning resources

- http://cuiqingcai.com/1319.html
- http://www.liaoxuefeng.com/wiki/001374738125095c955c1e6d8bb493182103fac9270762a000/001391435131816c6a377e100ec4d43b3fc9145f3bb8056000
- http://www.runoob.com/python/python-object.html