GOAL: suggest similar notes based on the note I'm writing, as I write it.
How close am I? Start with this doc: Research vs prototyping: research is deeper.
ARCHITECTURE:
Link → representation (ping BERT as a service)
Embed all my Notion documents
BOOM!!! Amazing progress. Tomorrow: make a script that returns similarity for a document. Maybe a Bitbar plugin too?
Representation → similar (vector similarity).
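A minimal sketch of the "representation → similar" step, assuming each note has already been embedded into a plain list of floats. The titles, toy vectors, and the helper name most_similar are made up for illustration; real BERT embeddings would be much longer.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def most_similar(query_vec, doc_vecs, k=3):
    """Return the k (title, score) pairs with the highest cosine similarity."""
    scored = [(title, cosine_similarity(query_vec, vec))
              for title, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical 3-dimensional "embeddings" standing in for real ones.
doc_vecs = {
    "Research vs prototyping": [0.9, 0.1, 0.0],
    "Grocery list": [0.0, 0.2, 0.9],
    "Prototyping notes": [0.8, 0.3, 0.1],
}
draft_vec = [1.0, 0.2, 0.0]
top = most_similar(draft_vec, doc_vecs, k=2)
for title, score in top:
    print(f"{score:.3f}  {title}")
```

Sorting every note is fine at personal-notes scale; an index would only matter with far more documents.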
BertClient connects over localhost:5555, which is forwarded to the server. Connection secured.
BERT server then tries to communicate back over 5556. To where? Localhost? A return IP? I have 5556 remotely forwarding back to local 5556.
OK - BERT server receives a connection.
Trying to debug with tcpdump (see https://jvns.ca/networking-zine.pdf): sudo tcpdump port 5556 -w thing.pcap, but getting no packets. What am I misunderstanding? Read up on tcpdump, perhaps, and how it interacts with SSH forwarding.
OK - so I actually am receiving a bunch of traffic on 5556, but wait - is remotely forwarding to 5556 blocking my port here? No, I don't think so. 5555 has a listener on it, but not 5556.
Wait, I think it's because BertClient isn't listening on 5556? What is it supposed to be doing? Go read the code and figure out what port_out does.
Next: I'm a bit stumped. Think about what could be going wrong. Seems like server is sending things back, but it's not getting to the client.
- RemoteForward 5556.
- Neither.
- LocalForward 5556.
Traceback (most recent call last):
File "/Users/jasonbenn/.pyenv/versions/worldview/lib/python3.7/site-packages/bert_serving/client/__init__.py", line 211, in arg_wrapper
return func(self, *args, **kwargs)
File "/Users/jasonbenn/.pyenv/versions/worldview/lib/python3.7/site-packages/bert_serving/client/__init__.py", line 237, in server_config
return jsonapi.loads(self._recv(req_id).content[1])
File "/Users/jasonbenn/.pyenv/versions/worldview/lib/python3.7/site-packages/bert_serving/client/__init__.py", line 169, in _recv
raise e
File "/Users/jasonbenn/.pyenv/versions/worldview/lib/python3.7/site-packages/bert_serving/client/__init__.py", line 157, in _recv
response = self.receiver.recv_multipart()
File "/Users/jasonbenn/.pyenv/versions/worldview/lib/python3.7/site-packages/zmq/sugar/socket.py", line 475, in recv_multipart
parts = [self.recv(flags, copy=copy, track=track)]
File "zmq/backend/cython/socket.pyx", line 791, in zmq.backend.cython.socket.Socket.recv
File "zmq/backend/cython/socket.pyx", line 827, in zmq.backend.cython.socket.Socket.recv
File "zmq/backend/cython/socket.pyx", line 191, in zmq.backend.cython.socket._recv_copy
File "zmq/backend/cython/socket.pyx", line 186, in zmq.backend.cython.socket._recv_copy
File "zmq/backend/cython/checkrc.pxd", line 19, in zmq.backend.cython.checkrc._check_rc
zmq.error.Again: Resource temporarily unavailable
YUP! Makes sense, because both connections were initiated on the client. Awesome.
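Since the client opens both connections, both ports want a local forward. A small sketch that builds the matching ssh command; the host name gpu-box and the helper name ssh_forward_command are hypothetical, and it assumes the server listens on the same port numbers on its own localhost.

```python
def ssh_forward_command(host, ports=(5555, 5556)):
    """Build an ssh command that local-forwards each port to the same port on host."""
    cmd = ["ssh", "-N"]  # -N: forward ports only, run no remote command
    for port in ports:
        cmd += ["-L", f"{port}:localhost:{port}"]
    cmd.append(host)
    return cmd

# "gpu-box" is a placeholder for whatever the server is called in ~/.ssh/config.
command = ssh_forward_command("gpu-box")
print(" ".join(command))
```

With the tunnel up, the client should be able to point at localhost for both the send port and the receive port.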
Well - I got something...!
Holy COW I can't believe how much better cosine similarity is than a simple dot product (for NLP - image interpretability research uses dot products). Cosine similarity normalizes the vectors' lengths, right?
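A toy check of the length-normalization point, with made-up 2-d vectors: cosine similarity is the dot product divided by both vector lengths, so a long vector pointing somewhere else can win on dot product and still lose on cosine.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Dot product divided by the product of the two vector lengths.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

query = [1.0, 1.0]
same_direction_short = [0.5, 0.5]   # points the same way as query, but short
other_direction_long = [10.0, 0.0]  # points elsewhere, but long

# Dot product rewards sheer length: the long vector scores higher.
print(dot(query, same_direction_short), dot(query, other_direction_long))

# Cosine only looks at direction: the aligned vector scores higher.
print(cosine(query, same_direction_short), cosine(query, other_direction_long))
```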
These suggestions are just so awesome.