用于记录我线上练习时遇到的知识点,以供日后复习的时候对照书本查看
毛概与习概的复习思路都是先线上练习,记录知识点,期末的时候对照书本查看小题,同时背诵大题
复习的时候要注意可能的题目变形
1
2024年7月15至18日,二十届三中全会在北京举行。全会指出,到2029年,完成《决定》提出的改革任务。
2
二十届三中全会提出,要健全因地制宜发展新质生产力体制机制,推动技术革命性突破,生产要素创新性配置和产业深度转型升级。
(没有实体经济数字化,可以想想,确实实体经济难道一定要数字化吗?有的经济可能本身就不适合数字化…)
3
党的二十届三中全会指出,坚持系统观念处理好经济和社会、政府和市场、效率和公平、活力和秩序、发展和安全等重大关系,增强改革系统性、整体性、协同性。
(坚持系统观念可以帮助处理一系列重大关系)
4
二十届三中全会指出,推动生产关系和生产力、上层建筑和经济基础、治理体系和治理能力更好相适应,为中国式现代化提供强大动力和制度
保证。(错误)
5
中国式现代化的鲜明标志是开放(正确)
1
下列属于邓小平南方谈话终提出的重要论断的有
1)要提倡科学,靠科学才有希望
2)革命是解放生产力,改革也是解放生产力
3)“三个有利于”标准
4)社会主义可以搞市场经济
2
2000年的宪法修正案正式将邓小平理论载入宪法(错误)
3
党的十七大把科学发展观确立为党必须长期坚持的指导思想(错误)
4
邓小平在党的十二大上提出“走自己的道路,建设有中国特色的社会主义”的论断,中国特色社会主义成为了党的全部理论和实践创新主题。
5
新世纪新阶段,我国进入发展关键期、改革攻坚期和矛盾凸显期,经济社会发展呈现一系列新的阶段性特征。
(没有开放活跃期)
6
中国特色社会主义理论体系形成时期的时代主题是和平与发展
1
邓小平指出社会主义的本质是
1)解放生产力,发展生产力
2)消灭剥削,消除两极分化
3)最终达到共同富裕
没有以经济建设为中心
2
1978年12月召开的党的十一届三中全会,做出了以下重大决策
1)重新确立了解放思想、实事求是的思想路线
2)批评了“两个凡是”的错误方针
3)确定把党和国家的工作重点转移到社会主义现代化建设上来的战略决策
因为是“决策”,所以没有“提高了对科学社会主义的认识”
3
发展社会主义市场经济是当代中国的鲜明标志和活力源泉,是决定中国命运的关键一招(错误)
4
邓小平认为和平是东西问题,发展是南北问题(正确)
5
1997年党的十五大指出不发达的社会生产力水平是中国最大的实际(错)
6
党的十四大把建立社会主义市场经济体制作为我国经济体制改革的目标
7
在社会主义的依靠力量上,人民群众是我们党的力量源泉和胜利之本
8
为了更好实现“三步走”的现代化发展战略,邓小平提出了以重点带动全局的思想。1982年他提出了三个战略重点,包括了农业、能源和交通、教育和科学
(没有文化与技术、工业与金融)
9
以经济建设为中心是改变我国不发达现状的需要,也体现了社会主义奋斗精神。(错误)
(我国的现状不是不发达,人民日益增长的…)
1
贯彻“三个代表”重要思想,关键在坚持与时俱进
(或许核心在执政为民)
2
坚持中国共产党的领导,就是要
1)2)3)4)
3
社会主义的根本任务是发展社会生产力
(不是实现共同富裕)
4
建设社会主义政治文明,最根本的就是要坚持党的领导、人民当家作主和依法治国的有机统一。
5
江泽民强调,我们想做事,做工作,想得对不对,做得好不好,根本的衡量尺度就是
1)2)3)4) 人民…
6
党的领导是人民当家做主和依法治国的根本保证(正确)
7
推进党的建设新的伟大工程,重点是加强党的执政能力建设
8
始终做到“三个代表”,是我们党的立党之本、执政之基和力量之源
(没有兴国之要)
1
科学技术是先进生产力的集中体现和主要标志
2
以科学发展为主题,是时代的要求,关系改革开放和现代化建设全局
3
自主创新是科技发展的灵魂,是一个民族发展的不竭动力,是支撑国家崛起的筋骨
4
坚持统筹兼顾,妥善处理社会主义事业重大关系,包括
1)统筹城乡发展
2)统筹区域发展
3)统筹经济社会发展
4)统筹国内发展和对外开放
5)统筹人与自然和谐发展
5
扩大内需是我国经济社会发展的基本立足点和长期战略方针,也是调整经济结构的首要任务。
6
Read more -->建设生态文明关系人民福祉,关乎民族未来的长远大计
这篇blog用于记录,我在学习计算机系统工程导论,有关性能的章节时,阅读的一篇叫做MapReduce的论文
他们的性能目标是通过MapReduce实现大规模数据专用计算的自动并行化,使得缺少并行与分布式系统经验的程序员可以轻松利用大规模分布式系统资源,从而突破数据处理的时延这一性能瓶颈。
MapReduce程序主要有用户部分和库部分,前者根据用户的业务逻辑需要在后者的基础上进行编写,而相关的并行化、容错、本地优化和负载均衡的细节被隐藏在后者中。整个系统的工作流程是:MapReduce库分割输入文件为较小块,并启动程序运行于机器集群之上;主结点根据块的分割情况,将map、reduce任务分配给各个机器处理;map任务通过用户的MAP函数建立相应的键值对;reduce任务将map任务处理的结果进行排序从而组合相同键,再通过用户的Reduce进行处理。这样的程序系统可以运行于廉价的PC集群之上,可以显著减少时延。
基于此,Google使用MapReduce程序来进行自己业务要求的实现:重写了生成Google网页搜索服务所需数据结构的生产索引、生成流行查询报告的数据提取、用于新实验和产品的网页属性提取等等。
因为许多类型的问题都可以轻松地用MapReduce这种模型进行表示,从而被并行化地计算,所以其泛用性很好;基于这种模型的处理方式具有很强的可拓展性,考虑增加相应的工作结点,就可以用较低的代价换取较高的性能提升,通过合适的Map、Reduce函数也可以带来更加个性化的工作方式;这样的实现充分考虑了故障的可能,以及潜在的风险,并设计了相对应的处理方式,使得其可靠性较强,这一点从文献中提供的相关数据也可以看出。
Read more -->高斯混合聚类不同于k-means、LVQ利用原型向量刻画聚类结构,而是利用概率来刻画聚类结构。
简单来说,这种算法认为数据集中的每个样本都符合一个多元高斯分布(多元的原因是样本常是多元向量),如下
所有的样本共同符合“混合高斯分布”。混合高斯分布对应的概率密度函数是所有多元高斯分布密度函数的加权量。
若$x$服从多元高斯分布,对应概率密度函数为
$p(x) = \frac{1}{(2\pi)^{\frac{n}{2}}\lvert \Sigma \rvert^{\frac{1}{2}}}e^{-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)}$,其中$x$是样本对应的向量,$\Sigma$是协方差矩阵,$\mu$是期望
为了便于理解,参照一元高斯分布
$\frac{1}{\sqrt{2\pi}\sigma}e^{\frac{-1}{2}\frac{(x - \mu)^{2}}{\sigma^{2}}}$
协方差矩阵就对应方差、多元高斯分布中期望对应一元中的期望(只是多元高斯分布中期望是一个多维向量)
所以多元高斯分布的情况,由$\Sigma$和$\mu$唯一确定
将多元高斯分布密度函数记作$p(x|\mu,\Sigma)$
可以定义混合高斯分布如下:
若$x$服从混合高斯分布,整个样本空间对应有k种多元高斯分布,对应概率密度函数$p_M = \sum^{k}{i=1} \alpha_i p(x|\mu_i,\Sigma_i)$,其中$\alpha$是混合系数,$\alpha_i$对应的实际意义是,选择第$i$个混合成分的概率,所以有$\alpha_i > 0$且$\sum^{k}{i=1} \alpha_i = 1$;而$p(x|\mu_i,\Sigma_i)$对应第i个混合成分的概率密度
令数据集$D = {x_1,x_2,…,x_m}$随机变量$z_j \in {1,2,…,k}$,$z_j$表征样本$x_j$属于哪个混合成分。
对于$x_j$并不确定的情况下,$z_j$的先验分布为
$p(z_j = i) = \alpha_i, i = {1,2,…,k}$
根据贝叶斯定理,当$x_j$确定时,$z_j$的后验分布记作$p_M(z_j = i|x_j)$,为
$p_M(z_j = i|x_j) = \frac{p(z_j = i)p_M(x_j|z_j = i)}{p_M(x_j)}$
其中,$p(z_j = i) = \alpha_i$;$p_M(x_j|z_j = i) = p(x|\mu_i,\Sigma_i)$,因为当确定$z_j = i$时除了$\alpha_i = 1$其它$\alpha_1,…\alpha_i-1,\alpha_i+1,…,\alpha_k$均为$0$;而$p_M(x_j)$即混合高斯分布的概率密度函数
将$p_M(z_j = i|x_j)$简记作$\gamma_{ji}$,这个概率就是$x_j$的分布为$\mu_i,\Sigma_i$所对应的多元高斯分布的概率。
于是我们可以找到令$\gamma_{ji}$最大的$i \in {1,2,…,k}$,令$\lambda_j = argmax_{i \in {1,2,…,k}} \gamma_{ji}$,$\lambda_j$即$x_j$的标签。对于每个都进行相同的操作,整个数据集便可被划分为k个多元高斯分布对应的k个簇。
对于k-means、LVQ的模型训练实际上就是通过训练集获得对应的原型向量,有了原型向量便有了划分为簇的依据,也就完成了模型的训练。
而对于高斯混合聚类,根据前面的描述,我们划分为簇的重要依据就是$\gamma_{ji}$,进一步说实际上是计算$\gamma_{ji}$的依据,根据计算公式可知,这个依据实际上就是决定高斯混合分布的参数$\mu_i,\Sigma_i,\alpha_i, i \in {1,2,…,k}$。
于是训练的目的实际上就是要得到它们的值。
对于模型参数${(\alpha_i,\mu_i,\Sigma_i)|1 \le i \le k}$,我们采用极大似然的方法进行求解。
这里引用南瓜书中的一句话:
“对于每个样本$x_j$来说,它出现的概率是$p_M(x_j)$既然现在训练集D中确实出现了$x_j$,我们当然希望待求解的参数${(\alpha_i,\mu_i,\Sigma_i)|1 \le i \le k}$能够使这种可能性$p_M(x_j)$最大”
于是根据极大似然方法,我们令
$LL(D) = ln(\prod^{m}{j=1} p_M(x_j)) = \sum^{m}{j=1} ln(\sum^k_{i=1} \alpha_i \cdot p(x_j|\mu_i,\Sigma_i))$
为对数似然函数,将其最大化得到对应的参数,就是我们要求解的模型参数。
经过一系列数学运算,我们可以得得到如下结果:
$\mu_i = \frac{\sum^{m}_{j=1} \gamma_{ji} x_j}{\sum^{m}_{j=1} \gamma_{ji}}$
$\Sigma_i = \frac{\sum^{m}_{j=1} \gamma_{ji} (x_j - \mu_i)(x_j - \mu_i)^{T}}{\sum^{m}_{j=1}} \gamma_{ji}$
$\alpha_i = \frac{1}{m} \sum^{m}_{j=1} \gamma{ji}$
$i = 1,2,…,k$
注意结果当中,每一个参数的计算都要用到$\gamma{ji}$,而问题在于计算$\gamma{ji}$的时候又要用到参数本身,这似乎是一个循环的无从下手的问题。这种情况下,我们利用EM算法来求解。
EM算法(Expectation-Maximization)称为“期望最大化算法”,这种算法最开始应用于:使用极大似然法对模型参数进行估计,但是已知的样本中存在还没有“观测”的变量,这种变量称为隐变量,它的值是不确定的。
令$X,Z,\Theta$分别为已观测变量集、隐变量集、参数集,则应最大化对数似然$LL(\Theta|X,Z) = lnP(X,Z|\Theta)$,但是Z是隐变量,所以无法直接求解。
EM算法可以用于估计隐变量,并在这个过程中对参数做最大似然估计。
其基本的思想是这样的:
不断重复上述两个过程——迭代,直到满足退出条件,例如$\Theta$收敛
贴一篇介绍EM算法的博客:
CSDN
结合EM算法,整个高斯混合聚类算法流程如下:
输入:
过程:
Data.py:数据集部分
1 | D = [ |
main.py:主函数部分
1 | import Gauss |
Guass.py:高斯混合聚类算法部分
1 | import numpy as np |
进行这一部分的时候让我最大的感悟是,在使用计算机进行数据处理的时候,很可能会出现数据的溢出问题,非常大、非常小的数都在计算机的表示范围之外,就会带来问题。例如在进行协方差矩阵更新的时候,有时它的行列式值虽然不是0,但是已经非常小了,计算机会默认其为0。同样,可能作为除数的数也是一样的,如果太小变为0就会发送除以0的错误。
这称为数值稳定性问题
因此代码中有一些涉及处理这些问题的地方(主要是在求第i个多元高斯分布中取得xj这个值的概率的时候,一处是正则化、一处是对数化并添加过小的判断),尽管我现在并不清楚这样处理是否有问题。但是抛开这些细节,作为一次上手的练习,整个混合高斯聚类算法是正确的。
不过值得一提的是,EM算法的收敛与迭代的次数没有必然的关系,通常应该使用参数变化的情况来决定是否结束迭代
最后的结果如下:
高斯混合聚类 的结果
2
[[0.608, 0.318], [0.359, 0.188]]
0
[]
28
[[0.697, 0.46], [0.774, 0.376], [0.634, 0.264], [0.556, 0.215], [0.403, 0.237], [0.481, 0.149], [0.437, 0.211], [0.666, 0.091], [0.243, 0.267], [0.245, 0.057], [0.343, 0.099], [0.639, 0.161], [0.657, 0.198], [0.36, 0.37], [0.593, 0.042], [0.719, 0.103], [0.339, 0.241], [0.282, 0.257], [0.748, 0.232], [0.714, 0.346], [0.483, 0.312], [0.478, 0.437], [0.525, 0.369], [0.751, 0.489], [0.532, 0.472], [0.473, 0.376], [0.725, 0.445], [0.446, 0.459]]
经过我的观察,第3次迭代之后划分结果基本上就稳定了,第1次的时候分得比较均匀。这或许是数据的原因。(也有可能是我对于数值稳定性的处理不好)
如果将正则化项改大一些结果如下:
Read more -->高斯混合聚类 的结果
12
[[0.608, 0.318], [0.556, 0.215], [0.403, 0.237], [0.481, 0.149], [0.437, 0.211], [0.243, 0.267], [0.245, 0.057], [0.343, 0.099], [0.359, 0.188], [0.339, 0.241], [0.282, 0.257], [0.483, 0.312]]
10
[[0.634, 0.264], [0.666, 0.091], [0.639, 0.161], [0.657, 0.198], [0.593, 0.042], [0.719, 0.103], [0.748, 0.232], [0.714, 0.346], [0.751, 0.489], [0.725, 0.445]]
8
[[0.697, 0.46], [0.774, 0.376], [0.36, 0.37], [0.478, 0.437], [0.525, 0.369], [0.532, 0.472], [0.473, 0.376], [0.446, 0.459]]
这篇blog用于记录我使用python对学习向量量化这种聚类算法的复现
学习向量量化也成为LVQ(Learning Vector Quantization),同样属于原型聚类算法,类似于k-means通过希望划分的簇的数量求得相同数量的“簇中心”并以此为原型将数据集划分为对应的簇,LVQ通过求得与希望划分的簇数量相同的“原型向量”,并以此来将数据集划分为对应的簇。
如果说k-means也同样是借助原型向量的话,那么关键就在于两种算法更新原型向量的方法不同。k-means是不断的用原型向量划分簇,又用簇更新原型向量;LVQ则是利用样本的预先标注作为“监督信息”,不断利用样本更新原型向量。
整个算法的大致流程如下:
输入: $D = {(x_1,y_1),(x_2,y_2),…,(x_m,y_m)}, q, {t_1,t_2,…,t_q},\eta \in (0,1)$其中,D是带有标记的数据集,q是原型向量个数,$t_i,i \in {1,2,…,q}$ 对应原型向量的标记,$\eta$是学习率
算法过程:
整个算法过程的关键就在于5、6,实际上相当于找到距离原型向量最近的样本,如果是同标记的则将该原型向量向该样本“拉近”,如果是不同标记的则“推远”(因为对应的是同标记的在一个簇中可能性较大,不同标记在不同簇中可能性较大)
关于7的退出条件,通常可以设置一个最大迭代轮数,或者是原型向量的更新程度已经小于了一个阈值。
使用python对西瓜书上的示例复现代码如下(30个样本,9-21号样本标记为2,其它样本标记为1,随机选择5个样本作为原始向量,标记分别为1、2、2、1、1,学习率为0.1)
Data.py数据集部分:
1 | D = [ |
main.py主函数部分:
1 | import K_means |
LVQ.py学习向量量化算法部分:
1 | def lvq(D:[[float]], T:[int], q_vect:[]): |
捉个一个虫,py中万物皆对象,在初始化q_vec的时候直接将数据集D中的元素append进去,实际上共享了内存,这会导致更新原型向量的时候,数据集中的元素被更新,解决方案是使用深拷贝
修改方法:
将q_vec初始化的地方:
1 | q_vec = [] # 原型向量 |
更改为:
1 | q_vec = [x.copy() for x in (D[i] for i in q_vec_index)] # 关键修复点:使用列表拷贝 |
最终运行结果:
3
[0.36, 0.37] [0.478, 0.437] [0.446, 0.459]
8
[0.634, 0.264] [0.556, 0.215] [0.666, 0.091] [0.639, 0.161] [0.657, 0.198] [0.593, 0.042] [0.719, 0.103] [0.748, 0.232]
9
[0.403, 0.237] [0.481, 0.149] [0.437, 0.211] [0.243, 0.267] [0.245, 0.057] [0.343, 0.099] [0.359, 0.188] [0.339, 0.241] [0.282, 0.257]
5
[0.697, 0.46] [0.774, 0.376] [0.714, 0.346] [0.751, 0.489] [0.725, 0.445]
5
[0.608, 0.318] [0.483, 0.312] [0.525, 0.369] [0.532, 0.472] [0.473, 0.376]
对应为5个簇中样本的数量和相应的样本
注:理论上应该将数据集D划分为训练集和测试集,通过训练集训练模型(得到所有合理的原型向量),然后利用测试集测试,利用原型向量将测试集划分为对应数量的簇;这样才能完整地体现“机器学习”,但是这里只是一个简单的例子,将数据集D全用作训练集,得到原型向量,再把整个训练集划分为了对应的簇。
有了原型向量之后,划分为簇是很简单的,样本距离哪个原型向量最近,就纳入对应的簇即可。
这篇博客用于记录我在备考英语六级时所做的词汇准备,目前我已经完成了核心词汇的记忆,我会在这里为这些词汇补充一些例句。
目前的打算是每天20个词语,此外我会开始着手阅读、听力以及写译的准备,这篇blog中也会用于记录相应的内容
1
stabilize “How can I stabilize the colour of our love, my dear.”
2
manipulate “Your mean is that you can manipualte such a monster machine.”
3
ambiguous “Don’t treat me with such a ambiguous attitude.”
4
interaction “The relationship forms from the interaction among people little by little.”
5
perception “What’s your perception about our future?”
6
strive “I’ll strive to achieve our happiness in future.”
7
expense “How much is the expense of our lives in the month?”
8
cater “We will cater for the party.”
9
summon “I am trying to summon up the courage to love you.” “He was summoning the elevator.”
10
plug “The broken bus pluged the traffic.”
11
elaspe “I don’t think our love will elapse.”
12
authorize “The law won’t autorize anybody to kill.”
13
commentary “With a conception about freedom of expression, the commentraies online are too cynic and dirty sometimes.”
14
conservative “Some people are still conservative about sex, which’s treasurable.”
15
arson “People would arson to destory all the thing of a witch even her body in the middle centery, however many women called wicth are innocent.”
16
litre “How much water should a people to drink a day? 2 litres?”
17
vengeance “The lifelong traveling of Gess is to make vengeane.”
18
expenditure “He is rigorous in his contorl of expenditure.”
19
overwhelm “The fear overwhelmed in his mind.”
20
surpass “His achievement supasses anybody.”
1
visualize “If you can visualize your success, you’ll succeed.”
2
tension “For several hours tension mounted.”
3
shrewd “She is a shrewd bussinesswoman indeed, however she is a kind mother as well.”
4
phase “For our aim of happiness, we are staying the phase of adapating to each other.”
5
abject “If you are down to abject poverty, you actually get everything.”
6
intense “The fight between Gess and Geri is intense.”
7
recruit “His dream is to be recruited as a chairman by Tiktok.”
8
abolish “The princess will abolish the man’s title of knight, granting a new title of lover to him.”
9
reluctant “I don’t want your hug because you are reluctant.”
10
vacant “The room is vacant so you can use it.”
11
priest “My priest, listen to my confession. Please.”
12
transparent “My heart is just like a transparent river.”
13
hamper “Why did you hamper me forever?”
14
rust “The blank-red rust grow on the sword, saying the pride from the past.”
15
sulphur “Whenever I see the word sulphur, I think of my chemistry teacher Mr.Fu.”
16
vivid “The birds on the picture is vivid.”
17
intensive “You need to keep intensive when you are taking exam.”
18
diagnose “The doctor diagnosed him, saying he just need to take a rest.”
19
convict “God will convict you!”
20
implement “When will we implement this law?”
1
vertical “The wall is vertical.”
2
clinical “Clinical medicine is a difficuilt subject.”
3
thrill “When she heared sound like this, she thrilled from the bottom of feet to the upper head.”
4
paw “Don’t touch the paw of a cat.”
5
conscience “Where are your conscience?”
6
scope “This is a scope so you should scope this question to find where is the key.”
7
excceed “His power excceed a lot.”
8
indicative “How can you speak out with such a indicative tone.”
9
mayor “The mayor of the city is a lion.”
10
rectify “Should I rectify my teeth?”
11
commemorate “People made a sculpture to commemorate this hero.”
12
nourish “We are noursihed by the land we stand.”
13
lease “I make the lease so I lease the house.”
14
flee “She fleed away and will never go back.”
15
criterion “Who makes the criterion, whose talking work.”
16
credentials “I spend lots of time in my life to study so I got this credentials finally.”
17
notable “It’s not a notable thing.”
18
secure “Please secure that the ladder is secure so I can feel secure.”
19
gear “There is a little gear which is broken so the forth gear of the car can’t be used.”
20
axis “Fllow the axis so you can draw anything accurately.”
1
filter “If you want a cup of clear water, you could use the filter.”
2
confine “Your imagination will confine your arrival at the further place.”
3
rumour “If you really love her, you shouldn’t believe the rumour.”
4
category “There are some categories of people, which means that good people and bad people all exist.”
5
paradox “Sometimes I think I am the paradox itself, which hurts her a lot.”
6
transistor “Do you konw about transister? It sounds like some high technology.”
7
illusion “Give up your illusion and be ready to fight!”
8
ingredient “I once heard that the ingredients of girl are sweet, flower and any other things of happiness.”
9
feasible “Your idea is exactly genius and feasible.”
10
petroleum “If you mention my monitor in high shcool, I will think of petroleum which has relation to her profession.”
11
revenue “How much was the revenue of our government in the past year.”
12
transplant “If you want to make the flower alive, you can transplant it to another land to nourish it. If you want to make a people alive, you can give him a transplant.”
13
earnest “Please don’t hurt him, he is a earnest boy.”
14
instinct “Animals’ instinct will make them own the capablity to live in wild.”
15
resort “Whenever I listen to the word resort, I will think of people everywhere.”
16
solitude “Do you want solitude? Although sometimes you will feel lonely, you are free at all.”
17
notion “The notions about MTL is not easy.”
18
silicon “The silicon has a deep relation to the production of CPU.”
19
provoke “This funtion is used to provoke the next layer of GNN.”
20
intuition “You can’t do everything based on your intuition and you need to think independently sometimes.”
1
trial “Do you want to make a trial to chase her?”
2
dense “Someone will fear something tiny and tense.”
3
identical “There is no leaf identical to another one on the world.”
4
feminine “An expert said that sensitive psychology is feminine.”
5
tuition “The tuition from the teacher is fantastic but the tuition is expensive.”
6
empirical “I’m a empirical man and I’m not good at intution.”
7
predominant “A efficient algorithm should be predominant on the advantage of time or space.”
8
marvellous “Gess is a marvellous man with a life like legend.”
9
ignite “IGNITE!”
10
conquest “The conquest over her satisfy my hugury and thirsty.”
11
tenant “The tenant make lease with me yesterday.”
12
sphere “A sphere is an abstractive name of a kind of objects like basketball and football.”
13
negligible “You are actually negligible for me like a tiny sand swaying in the air.”
14
cynical “If you feel you are injured by the world, you can change it but never be a cynical people.”
15
tempt “The game is so tempting that he can’t focus on study.”
16
portion “Can you indentify these words:portion, composition, segment and part.”
17
competent “He is a competent man and he can be our leader.”
18
jury “She was finally commited by the lawyer and the jury.”
19
vibrate “The small tool is vibrating, making her thrilled constantly.”
20
session “They asked for the thrilling feeling from bottom of their feet to the upper head during the session of such a crazy activity.”
1
upright “The tower on the hill is upright.”
2
estate “A beautiful woman was killed in the estate lived by a lot of people.”
3
symphony “The symphony played by us sounds so smooth.”
4
medium “His ablity is medium among these people.”
5
commence “Let’s commence this job.”
6
exceptional “His advancement is exceptional.”
7
induce “You can induce him but not persude him.”
8
drought “This drought lasted 20 years, making no life here again.”
9
executive “He is recruited as the executive.”
10
initiative “She offers us a new initiative with the high initiative.”
11
riot “Tension occurred, society is turbulent and riots always appeared in everywhere.”
12
affirm “I can make an affirm that she isn’t a girl like that.”
13
outward “You can touch the outward texture. It’s so soft.”
14
denial “You can make a denial for me but I’ll never give up.”
15
faculty “The faculty refers to all the teacher of a colleage.”
16
supervise “We must make a principle to supervise officers.”
17
imperative “The task is so imperative that we must handle it now.”
18
mingle “If you mingle something, you have mixed them.”
19
obscure “Dying as a well-konwn hero or obscure as normal people, which one would you choose?”
20
forth “He paced back and forth.”
1
shuttle “Which shuttle wii you choose, if you want to go to the western coast. The plane shuttles to carry apples?”
2
personnel “Look at me! Personnel in attendance!”
3
intensify “What hurts my heart will intensify my power.”
4
turbulent “Intension occurred and society was turbulent.”
5
attendant “Two attandants of the KTV saw a monster and the anttendant disaseter.”
6
prospect “Do you think that we have the prospect of our future.”
7
texture “The texture of the cloth makes me feel comfortable.”
8
span “The ache of the wound is spanning and the span has been more than 12 hours.”
9
tragic “Your targic destiny makes you great.”
10
commodity “She is not a commodity and you should respect her.”
11
downfall “The reason of Osman’s downfall began from 2000 years ago.”
12
timber “We can use the timber to arson.”
13
accessory “I am not your accessory!”
14
ridiculous “You want to change her, a selflish woman, which is so ridiculous!”
15
textile “The country’s prosperity is based on the textile.”
16
composition “What’s the composition of the liquid.”
17
bribe “Don’t want to bribe him because he is a judge with justice.”
18
treaty “There are two treaties between me and Lele.”
19
archive “The archive showes us his experience of honor.”
20
retreat “Don’t retreat! We have guts, a very awesome man.”
1
dazzle “A dazzle dazzles my eyes.”
2
chronic “I’m so sorry to tell you that you have a chronic disease.”
3
compromise “I would never make a compromise on the view of love even for you.”
4
flourish “I hope that you can grow up well like plants flourishing under the sun.”
5
occupancy “The lasting occupancy of the computer results into the overflow of memory.”
6
vulnerable “Although his body is weak, his heart is never vulnerable.”
7
exclusive “For the president, the car is exclusive.”
8
favourable “Everybody holds a favourable attitude for the decision to recurit him as manager.”
9
spacecraft “The spacecraft is mainly made by element called Al.”
10
preserve “We are supposed to preserve the environment.”
11
analogy “How do you dare to make an analogy between a poor guy and the king.”
12
orient “You must orient, if you wanna to drive a ship on the ocean.”
13
warehouse “You can pile your goods in the warehouse.”
14
testify “I can testify that he is a good man.”
15
petty “Lele is actually a petty girl.”
16
cargo “Cargo is another name of goods.”
17
reckon “Gery reckons as a hero.” 一个本就含有被动语态的词语
18
bind “Binding a name to a object is the begining of artifical intelligence.”
19
expertise “The management of stack is the expertise of a CS students.”
20
conform “I won’t conform you without my principle.”
1
ego “If you can feel your own ego, you will acquire your real happiness.”
2
deficiency “The deficiency of your ego results in the regret of your life.”
3
assemble “We assemble something to get a set.”
4
punch “He was punched to death.”
5
irritate “Don’t irritate him otherwise he would kill you.”
6
patch “The patch on the pant makes her so sexy.”
7
prompt “A prompt prompt prompts him to give a right answer.”
8
oblige “I wanna to kiss you but I won’t oblige you until you are relunctant.”
9
correspondent “The correspondents are essential because they can deliver the information concerning the battle.”
10
flock “A flock of flock follow the God they trust.”
11
ponder “Confronted with this problem he pondered all the night.”
12
provision “The provision of food comfrot us a lot.”
13
stem “The stem of the flower is so thin.”
14
anticipate “I have anticipated that you might leave me one day but actually we will never leave each other during the life.”
15
threshold “The Forward Phase is just a threshold of your lengend jounery to pary for Elden Ring.”
16
expire “Don’t be unforgettable about your youngest which had expired and be concertrated on present.”
17
profess “I can’t believe you still profess that you love me even that you betray me.”
18
severe “She made such a severe worry and he won’t forgive her.”
19
ornament “We make quantities of bright ornaments to ornament our shop.”
20
germ “On the place you can’t see do exist lots of germs.”
1
reconcile “Do you think that I can’t reconcile myself to the prospect of losing you?”
2
transit “The car to transit money is protected carefully.”
3
whereas “I don’t want to leave you whereas we can’t reconcile our contradiction.”
4
prominent “Success in this competition is a prominent achievement.”
5
mount “You can mount the file system or mount a horse to go out!”
6
naval “The naval power of the country is strengthful.”
7
waist “I wanna to hold your waist.”
8
academy “Don’t look down upon a student of the academy, especially in Germany.”
9
isolate “She is isolated by all of them.”
10
acute “This is an acute problem and you should note it in details.”
11
feeble “Though he is phsically feeble, his soul is strong.”
12
pledge “You have pledged to me that you won’t leave me forever.”
13
pension “Don’t use the pension of your parents otherwise you would really make me disappointed.”
14
terminate “Could we terminate this crazy plan?”
15
assert “I can assert that I won’t make you lose!”
16
conceal “Where do you want to conceal yourself? No matter where you go to, I will find you finally.”
17
skim “If you just skim what I have said or what I have done, you will never really know me.”
18
authentic “I don’t know what the authentic character of me is because I am a people full of paradox.”
19
presumably “Presumably, she has never produced heartbeat for you so she can leave you easily.”
20
masculine “Although you just like a puppy confronted with her, I know you are actually masculine.”
1
applaud “Everyone at presence applauds for what he has done.”
2
intrigue “Such a strange thing intrigues him who is curious.”
3
delicate “You are supposed to treat her carefully because her heart is delicate.”
4
consolidate “What I have gotten could consolidate my confidence.”o
5
cable “Stay away from the cable otherwise you will get an electric shock.”
6
manifest “This is a manifest thing that manifests your loss!”
7
catastrophe “Looking at the broken world makes me know what a real catastrophe is.”
8
offensive “I’m so sorry that what I had said is offensive.”
10
dwell “Where your body dwells in, where your heart is then you will feel happy.”
11
eliminate “Why do you always want to eliminate who holds a different idea? You won’t be welcome.”
12
grief “Sometimes leaving from you would result in grief of ours but we all know that it’s the best anwser.”
13
concede “I’m relieved that you’d like to concede the wrong made by you.”
14
stagger “He is drunk deeply and he is just staggering.”
15
colonial “Our country will never be your colonial land!”
16
liquor “Liquor is a liquid will make you drunk.”
17
negotiation “The negotiation will be held concerning the theme of convicting the convict.”
18
peculiar “I’m peculiar sometimes because I am the parodox itself.”
19
confess “Your confessing will make us relieved.”
20
hoist “People will hoist a witch.”
1
breadth “Breadth means a big width.”
2
convene “We will convene a class meeting after this class.”
3
tremendous “What a tremendous script.”
4
toss “You make make a toss to decide which one to be tossed.”
5
utter “He utters this speaking with an utter justice.”
6
flap “Don’t use the flap to flap your belly.”
7
notorious “Don’t stay with her otherwise you will be notorious.”
8
uphold “Everyone should uphold the pride of constitution.”
9
cruise “This period of time of cruise makes me unforgettable, crusing on the sea and feeling the wind.”
10
overlap “You are supposed to notice the overlap among these books, which is important.”
11
scrape “Stop scraping the blackboard. We can’t stand it!”
12
insult “Someone is just cheap to like being insulted.”
13
permeate “Don’t let the virus permeating your blood or you will die.”
14
liability “Liability is responsibility from some aspects.”
15
comply “complying somebody means you conform him or her.”
16
harsh “Sometimes the reallity is harsh.”
17
stereotype “Nowadays Gaokao in China has become a kind of stereotype.”
18
generalize “If you can generalize this paper, you will understand it more deeply.”
19
defect “Her defects are that she is stingy sometimes.”
20
scheme “I want to see a scheme to explain your plans.”
1
folklore “The folklore in China is that two people who want to get married with each other should ask the suggestions of their parents.”
2
drown “The huge and grand water will drown us.”
3
loose “Don’t touch it! It’s about to loose.”
4
transaction “If you make a deal, you would make a transaction.”
5
trivial “Trivial thing is something negligible.”
6
proportion “This proportion of our enterprise is belong to Mr.L, a beautiful lady.”
7
concrete “The concrete is concrete thing which isn’t abstract.”
8
confidential “We all are supposed to protect the confidential material of our country.”
9
portray “Can you portray the beautiful view of the spring.”
10
additive “We don’t want the food with lots of additive.”
11
flip “When you toss the coin is flipped to the air.”
12
furnish “We can furnish you with some desks and chairs to furnish your home.”
13
suspicion “I don’t like your suspicion even that I can understand you.”
14
refute “Don’t incline to refute me and be patient to listen to me.”
15
migrate “Our ancestors migrated from a very far place to here.”
16
beam “I want the girl to beam because the smile of her is just like quantities of beam lighten my heart.”
17
token “There are lots of tokens so you don’t need money.”
18
navie “Confronted with love we are all navie.”
19
corporate “As a person recruited by us you are supposed to think of the corporate interests.”
20
emit “Your smile can emit the beam.”
1
curl “Her hair is curled and she has the curl.”
2
funeral “I won’t attend your funeral because it would make me sad.”
3
postpone “This meeting is postponed so we should attend this meeting on next monday.”
4
polish “He polish his sword so that it is shiny.”
5
wander “Don’t wander in front of me, which will distract my attention.”
6
deputy “She is the deputy of the policeman and she is a good assistant.”
7
coward “He almost fears everything so he is a coward.”
8
missile “If the country still offends the pride of China, we will treat it with missile.”
9
allege “If you allege something, you assert something.”
10
destiny “Don’t just accept the destiny and fight back!”
11
facilitate “This strategy of our country facilitate our life, which means that it makes our life more convenient.”
12
revolve “Let us revolve this sphere and you won’t feel anything changed.”
13
doctrine “Do you want the priest to tell you our doctrine that ‘God always right’.”
14
distinct “The difference among these distinct species is distinct.”
15
ascertain “We are supposed to ascertain the truth of this case.”
16
symptom “If a doctor want to diagnose what’s disease she has, he will be based on the symptom.”
17
drift “We stay on the small boat and drift in the river.”
18
inherent “Kindness is an inherent characteristic of her which forms from the experience of her life so it won’t change.”
19
aggravate “Somking will aggravate the disease in her lung.”
20
alternative “There are lots of women one of whom will become an alternative of her.”
我在大二下选修了计算机组成原理,这篇blog用来梳理相关知识点
一些学习计算机组成原理之前应该知道的知识…
最后请谨记,该门课特点是概念抽象、繁琐…但是“清澈见底”,只要想弄清楚,一定可以!
这一部分主要是一些有关指令系统设计的知识点
于是,首先看看这三个知识点:
接下来,一条指令应该包含以下信息:
按地址数的指令分类:
零、一、二、三、多地址指令,其中二三是典型的RISC风格。三的特点是显示指定了dst,一或二的dst是隐含的(built-in or src)
指令执行的阶段:
取指令->译码->取操作数->运算(执行)->存放结果->取下一条指令
不一定所有指令都涉及所有步骤,但是考虑的时候应该按最复杂的来,何尝不是一种设计原则?
指令设计基本原则:
完备性,兼容性,均匀性,可扩展性
应当明白词语背后的含义
最简单的完备指令系统:
load, store, inc, brn
…
…
这是涉及关于如何给操作码编码,以及对应数量关系的问题。
核心思想是一种数字状态,一个编码
涉及到的相关信息有:
通常会已知1、2,4中某些地址数的指令条数,求剩余一种地址数的指令最多的条数。
关键点是:
尤其关注RISC的风格。
RISC是load/store型指令系统,特点是只有load、store命令才能访问存储器,其它运算类的指令通通不能访问存储器
(值得注意的是这种指令系统,属于通用寄存器型指令系统的子集,特点是使用通用寄存器存放临时数据,而不使用累加器)
RISC的特点是:
…
这一部分是重点知识,所以会有多级的副标题。请着重掌握!
一些零碎的知识点,不好纳入后面的各级标题之中,于是集中在此...
或者说并非无法纳入,而是比较重要...单拎出来也方便记忆
MIPS中指令格式包括R型、I型、J型。
机器级表示:
op6+rs5+rt5+rd5+shamt5+func5
注意,R型指令助记符表示的时候,实际上是 op rd rs rt的顺序,要和机器级位置区分开不妨考虑一下,op全为0的好处是什么
机器级表示:
op6+rs5+rt5+imm16
addi $2,$1,imm
lw $2,100($1)
beq $1,$2,L
机器级表示:
op6+target address26
MIPS使用32个通用寄存器,我们应该掌握以下有关的知识。
程序中一般都使用名称,举例子的时候使用编号多一些。或许,前者体现“助记”,后者体现“通用”
记忆的方法,“一个过程调用,使用了4个参数a,返回了两个值v,调用者保存了8个寄存器s,被调用者保存了10个寄存器t,关键在于sp、fp与ra”+zero
在开始做这一部分的笔记之前,我思考了一个问题——如何才能更好的记忆MIPS汇编指令。我得出的答案是,一般性规律的记忆+特殊性个例的记忆。对于一般性规律的记忆,其规律包括:指令的助记符+什么类型的指令+指令的特性,前两者可以帮助我们正确地写出指令,后者可以帮助我们正确地理解指令。对于特殊性个例,我们不妨记住全部。在记忆的过程中带着这个思想,或许会容易记忆一些。
我们接下来按照指令的类别进行。
一般性的规律是,助记符:加、减有I、R型,乘、除只有R型
存储访问,按照访存字节,分为按字(word,MIPS中是32位),按半字(half word,16位),按字节(byte,8位)访问
以及MIPS最有特色的指令lui
一般性的规律是,助记符;访存都是I型
MIPS中常用的逻辑运算是与、或、异或
一般性的规律是,助记符,逻辑运算有R型、I型
MIPS中的涉及的移位操作有,逻辑左、右移动,算术右移
一般性的规律是,助记符,移位只有I型
MIPS中涉及的条件分支,常用的是,slt、beq、bne
一般性的规律是,助记符,slt有R型,其它都是I型
MIPS中的跳转指令常用的是j、jr、jal
一般性的规律是,助记符,都是J型
这一小节主要是掌握MIPS汇编代码的编写的常见结构,包括分支结构、循环结构、还有过程调用
分支结构主要有如下两种:
if(i == j) or if(i != j)
这种等或不等,主要使用bne
,beq
进行if(i < j)
这种大于小于的关系,主要使用slt
与bne
,beq
进行下面是两个例子
eg1
if(i == j)f = g + h
else f = g - h
$s1<-i $s2<-j $s3<-f $s4<-g $s5<-h
start: bne $s1,$s2,else
add $s3,$s4,$s5
j exit
else: sub $s3,$s4,$s5
exit: ...
这里以while循环为例
eg
while(i != k){
x = x + a[i];
i = i + 1;
}
$s1<-x $s2<-i $s3<-k $s5<-a
loop: beq $2,$3,exit
sll $s7,$s2,2 #注意这行,不能直接将i <<= 2 (Bits -> Byte)
add $s7,$s5,$s7
lw $s6,0($7)
add $s1,$s1,$s6
addi $s2,$2,1 #注意这行,i = i+1(Bits)
j loop
exit: ...
注意,将偏移i换算为地址的时候要乘4,换算成对应按字节编址的情况
首先我们应该清楚整个过程调用的执行过程:
注意,以上是以最严格、完整的过程来叙述的,实际上都是根据需要来进行。于是可以做以下几点说明:
以上几点说明都是针对最开始描述的每一点过程进行的
下面还有一些需要补充的点
Ⅰ MIPS中栈帧是由高地址到低地址,这意味着分配栈帧空间对$sp执行的是减法操作
eg:在栈帧中分配空间,保存$ra,$a0
subi $sp,$sp,8
sw $a0,4($sp)
sw $ra,0($sp)
Ⅱ 一般只有在由数组或结构体等占用空间较大的复杂数据结构的时候才需要使用栈帧分配局部变量($t不够用)
Ⅲ Q没有进一步嵌套调用其它函数的情况,Q被称为叶子过程。一般的叶子过程通常在MIPS中甚至不需要开辟栈帧,因为有足够多的通用寄存器
Ⅳ 如果$fp不使用(建立当前函数的栈帧时并没有维护$fp),可以将$fp作为$s8来使用
Read more -->由于我在学习机器学习算法的时候,希望通过Python来对相关的算法进行复现。而自己在此之前其实零零散散不成体系地接触过Python语言,也了解一些基本的东西,但是对于Python中一些语言“特性”方面的东西所知甚少,例如变量的作用域与生命周期,不同模块间的访问等等;此外我对Python风格的代码写法也并不熟悉,其实写什么感觉都是C的味道......于是写下这篇blog用来记录,进一步对相关内容的学习
__name__是python模块当中的一个内置变量,每个模块都有。如果你选择当前模块开始执行,那么当前模块内置的__name__会被置为__main__;如果一个模块是被令一个模块import进去的,那么这个模块的__name__会被置为__模块名__,但是不会引入后缀。
通过__name__我们就可以将我整个项目文件模块化的组织起来。将一个模块作为程序的执行入口,并始终自我约束地从这个模块开始启动整个项目程序。这样做的关键在于使用如下代码:
1 | def main: |
关键点即,不要使用判断__name__以外的任何顶层代码
Python是一种解释性语言,特点就是不需要编译,而是在运行时通过解释器逐行读取、分析和执行源代码。对应的特点之一就是交互式的编程环境(可以在命令行中输入代码,并立刻看到执行的结果)
我联想到与这种特点相对应的就是——“顶层代码”,即相关的语句不会被封装在任何函数和类当中,点击运行,便会至上而下地逐行开始执行。
所以一个关键的特性就是,使用import导入模块化后,该模块的顶层代码会立刻执行。
启示:编写规范化的工程代码时,除了判断程序执行入口,不要使用顶层代码
1 | #k_means.py |
输出结果:
this is k_means this is main
全局变量
在同一模块当中,定义于模块层的变量(顶层代码部分),对应的是global varible
全局变量,这些变量的作用域是全局可见,生命周期是从程序开始执行开始,执行完毕结束。
局部变量
定义于函数中的变量是local varible
局部变量,作用域局部可见。对于嵌套函数,外层变量对内层可见,内层对外层不可见。在Python中这种函数嵌套更加的显然。下面是一个例子:
1 | def outer_function: |
对应变量的生命周期,都是从定义自己的函数开始,到函数执行完毕结束。
另外值得一提的是,在上面这个例子当中,inner_function不能从顶层代码调用。
3. 内置变量Built-in varible
内置变量的作用域是在任何地方都可以访问,且生命周期贯穿整个程序的运行期,最开始提到的__name__就是一个很好的例子。
4. 访问规则
python对于变量遵循LEGB
的访问规则,即局部、嵌套、全局、内置。当发现了变量,即刻使用。
最后简单补充以下Python的变量定义规则,变量在“第一次赋值”时被定义。当然这意味着我们要定义一个变量必须考虑一个初始值,如果暂时没有初始值的话可以使用None
作为初始值。随后根据需要赋予想要的初始值即可。当然,变量的类型也是根据你赋予的值来确定的。
为了理解多模块情况下相关变量的作用域和生命周期,引入以下概念:
模块对象,在导入模块的时候Python会为模块创建一个对象,这个对象的生命周期由其作用域确定
全局导入,模块对象在全局作用域中导入,此时模块变量生命周期同程序一样。作用域同全局变量。
局部导入,模块对象在局部作用域中导入,此时模块变量生命周期同导入了它的函数。作用域同相应的局部变量。
模块中的顶层代码在被导入时会立刻执行,相应的对应的全局变量会即刻创建,所以对应的全局变量生命周期、作用域,同模块对象。
口语化的来说,模块被导入的时候也相当于一个变量(或者是一个类),如果是被主函数所在的模块作为全局变量导入,那么被导入模块的生命周期、作用域同全局变量,如果被作为局部变量导入,也同局部变量。相应的,被导入的时候,被导入模块中的“全局变量”也会即刻被创建,其生命周期同被导入的模块。(毫不精准的表述…)
在使用以下代码的时候,名称冲突时常发生。
1 | from somemodule import somename |
这类似是跳过了模块对象,直接导入了其中某个全局变量,自然就很可能与当前模块已有的全局变量、函数发生名称冲突。
常用的解决方法,也是我们使用模块化的常用方法如下:
1 | import somemodule |
语法如下:
1 | list_name = [formula for _ in range(start, end)] |
这种创建列表的方法成为列表生成式,formula是用于生成列表的表达式,可以是返回一些值的函数,后面的循环是列表中生成元素的次数,循环一次便会调用一次formula。
当然formula也可以直接是数学表达式,例如第二个例子展示的,用于生成1到9的平方的列表。
注意end不被包含在内。
元组(Tuple)是一种内置的数据结构,属于不可变序列类型,用于存储多个元素。与列表(List)不同,元组一旦创建,其内容就不能更改(即不可变)。元组常用于存储一组相关的数据,例如函数返回多个值时,可以使用元组来打包这些值。
使用()
来创建一个元组,例如:
1 | # 创建一个空元组 |
在python中函数的参数不需要提前声明类型,同样的返回值也不需要提前进行声明。但是在大型的项目中为了便于程序的维护,以及提供静态的检查,可以使用注解符号。例如,下面这个例子。
1 | from typing import List, Tuple |
其中typing是类型注解使用的包,如果不需要使用类型进行注解可以不使用这个包。
常见的类型注解有:
我从现在开始学习聚类相关的内容,最终目标是希望发表一篇相关的论文。我以现在浅显的眼光给自己定下的学习路线如下
1. 完成西瓜书聚类部分的学习,完成的标志是将书上给出的伪代码进行真实地复现
2. 阅读综述论文,了解聚类对应的科研领域当前大概的情况
3. 阅读聚类有关的顶会论文......
我以现在的知识,无法继续制定下面的计划了,因为我并不了解3、往后的真正开始着手科研工作会是怎样的。我目前粗浅的想法是,或许我会了解到一些聚类的具体应用,然后为了完成一篇相关的论文:我也必须将聚类投入到具体的应用当中去,这个时候我不得不学习一些其它领域的知识(当然,目前我并不清楚那些会是什么);又或许我会做一些对聚类算法进行改进的工作,但是这或许会更加艰难(因为曾经一位厉害的学长告诉我将A运用于B会比将A升级为A+简单许多)
此外,我将这篇blog用作自己的学习日志与计划路线
2025.4.30
简单地说,就是要对一个n维向量元素的集合求一个划分,划分后的子集就是一类的(不相交的簇)。
对于数据集$D = {x_1,x_2,…,x_m}$,划分为k个不相交的集合$C_1, C_2, …, C_k$,若$x_i \in C_j$,则$\lambda_i = j$,其中$j \in {1,2,…,k}$,对应$\lambda_i$就是$x_i$的标签。聚类任务要做的是就是求出一个聚类结果$\lambda = (\lambda_1,\lambda_2,…,\lambda_m)$,其中$\lambda$为数据集的簇标记向量,第$i$个分量标记了$x_i$属于哪一个簇。
怎样的聚类是好的:
外部指标:将聚类结果和某个“参考模型”进行比较,称为外部指标
对于数据集$D = {x_1,x_2,…,x_m}$,使用聚类模型A,得到簇标记向量$\lambda$,另外使用参考聚类模型B,得到簇标记向量$\lambda^{*}$。
于是我们可以根据$\lambda_i$与$\lambda_j$相同与否的关系以及$\lambda^{}_i$
与$\lambda^{}_j$是否相同的关系定义如下集合。
$DD,DS,SD,SS$一共四个集合,这些集合中的元素类似$(x_i,x_j)$,是一个“向量对”,分别按照如下规则界定类似的向量对是否属于相应的集合
D即different,S即same 这样就非常容易理解了
根据上面的集合,我们可以定义如下过度变量
进一步,我们定义常用于性能度量的第一组系数
这些性能指标的范围都是$[0,1]$,并且越大说明聚类效果越好当然,前提是参考的模型是“正确”的
定义函数$dist(\cdot,\cdot)$,用于计算两个向量的距离。则它应该满足下述三个性质
常用的距离是闵可夫斯基距离
$dist_mk(x_i,x_j) = (\sum_{\mu = 1}^{n} \lvert x_{i\mu} - x_{j\mu} \rvert ^{p})^{\frac{1}{p}}$
显然当$p = 2$时即我们常用的欧氏距离,$p = 1$时为曼哈顿距离
在考虑属性之间的距离的时候,序十分重要。这里通过简单的例子引入有序和无序。属性值出自于能够直接计算距离的属性称为有序属性,例如属性定义域为${1,2,3}$,而不能的就是无序属性,例如${货车,西瓜,乐乐}$。
显然,闵可夫斯基距离是用于衡量有序属性的距离的。
假设有$k$个样本簇,$m_\mu,a$表示在属性$\mu$上取值为$a$的样本的个数,$m_\mu,a,i$表示在第i个样本簇中,属性$\mu$取值为$a$的样本个数。定义VDM如下。
$VDM = \sum_{i=1}^{k} \lvert \frac{m_\mu,a,i}{m_\mu,a} - \frac{m_\mu,b,i}{m_\mu,b}\rvert ^{p}$
值得注意的是,这里衡量的只是无序属性的距离,而要衡量两个无序样本$x_i$与$x_j$的距离,即其中的各个属性(类比向量的分量)都是无序属性,我们应该对各个属性的$VDM$求和。
不失一般性,我们可以定义混合元素的距离如下:
$MinkovDM_p(x_i,x_j) = (\sum_{\mu=1}^{n_c} \lvert x_{i\mu} - x_{j\mu} \rvert ^{p} + \sum_{\mu=n_c+1}^{n} VDM_p(x_{i,\mu},x_{j,\mu}))^{\frac{1}{p}}$
其中$x_i,x_j$为混合属性的元素,$1到n_c$对应为有序属性,$n_c到n$对应为无序属性
于是我们可以根据元素的不同(有序、无序、混合),选取我们需要的距离函数$dist(\cdot,\cdot)$,定义如下常用于刻画簇的性质的量
进一步我们定义一些内部指标如下。
DB指数越小越好,Dunn指数越大越好
原型的概念对应的是空间中的点。原型聚类的前提是认为,数据集中的聚类结构可以通过一组原型来描述。而原型聚类要做的就是通过某些方法找出这组“原型”。常见的原型聚类算法的代表有k-means(k均值算法)
、学习向量量化算法
等等
在后续的blog中会记录我复现相关算法的过程
Read more -->